German Electricity Data :electric_plug:¶

  • Data Source: SMARD Web Portal
  • Task: forecasting (using the past to predict the future)
  • Language: julia

First steps:¶

  • load the data (with appropriate column types)
  • split the data into train/val/test sets
  • plot the time series (at different temporal resolution)
  • calculate the mean production per month
  • is there a seasonal pattern (what kinds of seasonalities)
  • make a prediction for the validation set
  • evaluate the quality of the prediction
In [1]:
using Random
In [2]:
Random.seed!(42)
"Jörg Marko Sascha Chaitanya Sonia Robin Andi" |> split |> shuffle |> x -> join(x," -> ")
Out[2]:
"Jörg -> Sascha -> Robin -> Andi -> Chaitanya -> Marko -> Sonia"

1. Data loading¶

  • load the hourly data from the file Realisierte_Erzeugung_201501010000_202305301300_Stunde.csv as a table into julia
  • make sure, that data is interpreted and typed correctly
  • clean header names
In [54]:
using CSV, DataFrames, Dates
In [55]:
function string_parse(strg)
    if ismissing(strg) 
        return strg
    else
    string = replace(strg, "." => "")
    newstring = replace(string, "," => ".")
    newnewstring = replace(newstring, "-" => missing)
    return newnewstring
    end
end
Out[55]:
string_parse (generic function with 1 method)
In [56]:
function str_vec_to_float(vec) 
    b = string_parse.(vec)
    c = Vector{Union{Missing, Float64}}(undef,0)
    for (i,x) in enumerate(b) 
        if !ismissing(x) 
            push!(c, parse(Float64, x))
        else
            push!(c,x)
        end
    end
    return c
end
Out[56]:
str_vec_to_float (generic function with 1 method)
In [57]:
df = DataFrame(CSV.File("Realisierte_Erzeugung_201501010000_202305301300_Stunde.csv", delim = ";"; missingstring = ["-"]))
head = names(df)
df[!,1] = Date.(df[!,1], "dd.mm.yyyy")
for i in head[4:end]
    df[!,i] = str_vec_to_float(df[!,i])
end
In [58]:
df
Out[58]:
73716×15 DataFrame
73691 rows omitted
RowDatumAnfangEndeBiomasse [MWh] Berechnete AuflösungenWasserkraft [MWh] Berechnete AuflösungenWind Offshore [MWh] Berechnete AuflösungenWind Onshore [MWh] Berechnete AuflösungenPhotovoltaik [MWh] Berechnete AuflösungenSonstige Erneuerbare [MWh] Berechnete AuflösungenKernenergie [MWh] Berechnete AuflösungenBraunkohle [MWh] Berechnete AuflösungenSteinkohle [MWh] Berechnete AuflösungenErdgas [MWh] Berechnete AuflösungenPumpspeicher [MWh] Berechnete AuflösungenSonstige Konventionelle [MWh] Berechnete Auflösungen
DateTimeTimeFloat64?Float64?Float64?Float64?Float64?Float64?Float64?Float64?Float64?Float64?Float64?Float64?
12015-01-0100:00:0001:00:004024.251158.25516.58128.00.0133.010710.515687.23219.751226.251525.754909.25
22015-01-0101:00:0002:00:003982.751188.0516.258297.50.0122.511086.215321.82351.25870.751079.254932.75
32015-01-0102:00:0003:00:004019.51139.25514.08540.00.093.011026.214817.52227.0809.5787.05041.75
42015-01-0103:00:0004:00:004040.751122.5517.758552.00.086.511027.814075.02339.75821.0287.755084.0
52015-01-0104:00:0005:00:004037.751112.0519.758643.50.086.510962.214115.02461.5831.25346.755070.75
62015-01-0105:00:0006:00:004028.251107.75520.08711.750.086.7510696.013474.22217.75851.0765.55096.75
72015-01-0106:00:0007:00:004013.251111.75521.59167.250.087.010299.512403.82373.25868.25414.55153.0
82015-01-0107:00:0008:00:004012.751113.75520.259811.00.087.010035.212062.52491.0876.0582.55161.0
92015-01-0108:00:0009:00:003999.751107.5525.259683.053.087.010245.812405.02530.25888.25750.55393.5
102015-01-0109:00:0010:00:004016.251121.0527.09501.75773.2585.7510060.212798.82386.25891.5387.05884.0
112015-01-0110:00:0011:00:004007.751122.0525.010025.02116.7582.010082.212728.52557.0888.25176.256064.0
122015-01-0111:00:0012:00:004011.751146.25527.2510862.53364.2582.09949.012451.02657.75876.5884.55914.75
132015-01-0112:00:0013:00:004014.01139.75527.7511575.24198.2582.010119.012464.52937.75874.25705.05778.25
⋮⋮⋮⋮⋮⋮⋮⋮⋮⋮⋮⋮⋮⋮⋮⋮
737052023-05-3001:00:0002:00:004581.52033.753891.7511488.50.0119.50.03523.51288.53984.75769.251146.0
737062023-05-3002:00:0003:00:004551.01963.53725.510141.00.0119.00.03484.251263.753592.0604.251141.75
737072023-05-3003:00:0004:00:004503.252090.53635.259170.250.0119.00.03490.01272.54030.5578.751132.5
737082023-05-3004:00:0005:00:004454.252105.253635.758735.752.25120.00.03678.251275.04895.5433.251136.0
737092023-05-3005:00:0006:00:004406.752134.752940.08419.75407.75120.250.04087.51289.255723.75404.251142.25
737102023-05-3006:00:0007:00:004413.252227.52272.57531.02888.75128.250.05010.751362.06525.252803.751144.5
737112023-05-3007:00:0008:00:004468.02276.51916.255507.258343.25137.00.05275.51173.756905.755712.751156.75
737122023-05-3008:00:0009:00:004503.52152.51643.253647.7515893.2138.00.05181.751196.256946.55079.51132.0
737132023-05-3009:00:0010:00:004544.52156.251285.02900.523612.5135.00.04707.51114.756384.52122.751115.0
737142023-05-3010:00:0011:00:004568.252085.251241.252729.530341.8121.250.03978.51095.54802.51279.01096.25
737152023-05-3011:00:0012:00:004618.752068.251053.253115.035196.2119.00.03759.751151.53762.25774.01096.5
737162023-05-3012:00:0013:00:004612.251967.25781.253496.2537030.0119.00.03765.01296.753240.0196.751099.75
In [0]: