Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion for pulling data on renewable energy from the Marktstammdatenregister #29

Open
curiousleo opened this issue Mar 25, 2022 · 5 comments

Comments

@curiousleo
Copy link

curiousleo commented Mar 25, 2022

The commit messages on this repo are in English, so I'm writing this issue in English as well.

Problem: renewable energy statistics

This repo contains a summary of renewable energy data organised by Gemeindeschlüssel in CSV format:

https://github.com/GermanZero-de/localzero-data-public/blob/main/renewable_energy/2018.csv

The README explains how this data was gathered:

This data originates in the "Marktstammdatenregister" [...] This summation and bucketing to all ags keys was done in excel and needs to be repeated in python soon as the we choose an ags ist for 2021 instead of 2018 (We will provide a script as soon as this is done.).

It seems like both the initial data gathering and the summation potentially involved a lot of work and may have been time consuming. It also sounds like there is an effort underway to automate the procedure.

Context: Marktstammdatenregister.dev

I created Marktstammdatenregister.dev to make it easy to query the Marktstammdatenregister. Summary statistics of the kind used in this repo are exactly the project's raison d'être.

Here's a preview of the relevant query that presents the summary statistics in the format used in the existing CSV file.

You'll see that it is missing statistics for solar power. I removed that part because the full query times out on the low-powered machine that runs the website.

However, almost any recent computer can give the full result quickly using the underlying database file, see the next section.

Idea: use SQLite export directly

In short (using localzero.sql):

# Download and decompress the SQLite export.
# Takes less than 30 seconds with a fast internet connection.
# Marktstammdatenregister.db.gz: ~800 MB
# Marktstammdatenregister.db:    ~2.5 GB
curl https://s3.eu-central-1.wasabisys.com/mastr-backup/Marktstammdatenregister.db.gz \
  | gunzip - >Marktstammdatenregister.db

# Generate the CSV file.
# Takes less than five seconds with an SSD and recent CPU.
curl -sSLo localzero.sql https://gist.githubusercontent.com/curiousleo/7940a6b03a3e0ccf7ad89d6334d678e8/raw/e77d9f8b4622c4e645ea00ce1a8887a90ff26f92/localzero.sql
sqlite3 --readonly --csv --header \
  Marktstammdatenregister.db \
  <localzero.sql \
  >2018.csv

The database file that can be queried on https://ds.markstammdatenregister.dev can also be downloaded for local use. That's what the above snippet does.

We can then run the full query, including solar power, locally. With the right command-line arguments, SQLite outputs a CSV file with the format used in the existing file 2018.csv.

To calculate the statistics for any other date, just edit the first line of localzero.sql, for example:

-with Stichtag as (select "2018-12-31" as Stichtag),
+with Stichtag as (select "2021-12-31" as Stichtag),

Note: results change over time

Many renewable energy units are registered after they've started operating, sometimes years later -- see this query.

So even a historical query like "how much solar power was generated in Buxtehude on 2018-12-31" may give different results depending on which version of the Markstammdatenregister is used.

Conclusion

This is what I built Markstammdatenregister.dev for. I think there's a good chance that it can save you time and effort.

Feel free to try the suggestions and to have a look around the data and the examples.

If you have any questions, I'm happy to help. It is also definitely possible that I made a mistake in writing the SQL query or misunderstood something about the summary statistics needed for LocalZero -- if you spot anything that looks off, let me know!

@bgrundmann
Copy link
Collaborator

Very cool. Exciting to see that others are working on this. Thanks. I myself are taking this week of to recharge after the sprint to make the release. @Ultramarin95 do you want to have a look?

@Ultramarin95
Copy link
Contributor

Witzigerweise habe ich genau den Export aus dem Marktstammdaten Register grade diese Woche geschrieben. ( Siehe GermanZero-de/localzero-generator-core#200 ) Ich werde mal die beiden "2018.csv" Versionen vergleichen und mir das Tool etwas genauer anschauen @curiousleo.

@curiousleo
Copy link
Author

@bgrundmann said:

Very cool. Exciting to see that others are working on this.

Same here! Thanks for your work on LocalZero!

@curiousleo
Copy link
Author

@Ultramarin95 said:

Witzigerweise habe ich genau den Export aus dem Marktstammdaten Register grade diese Woche geschrieben.

Nice. Ich habe entsprechend dem verlinkten Skript die SQL-Abfrage aktualisiert (Brutto -> Nettoleistung, Summe als "DG000000").

In dem Skript geht es viel um AGS, die auch aus einer externen Quelle geholt werden -- den Teil verstehe ich nicht ganz, deshalb habe ich, was das angeht, erstmal nichts an der vorgeschlagenen SQL-Abfrage geändert.

@curiousleo
Copy link
Author

An der Stelle sollte https://github.com/OpenEnergyPlatform/open-MaStR nicht unerwähnt bleiben. Ich habe es selbst aber noch nicht verwendet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants