Tutorial
Pandas vs Polars: practical example
"Kernel died" when using pandas; that's when polars comes in handy.
When we try to load all 139,310,467 rows of historical data from the Resultado de la Programación Horaria del PDBF (I90) files, covering 2014 through the latest available data, the two libraries, pandas and polars, give the results shown in the image.
The I90 files, part of Spain’s energy market operations, contain the Resultado de la Programación Horaria del PDBF: the hourly scheduling results of the Daily Base Operating Program (PDBF). These schedules describe how energy production units are planned to operate so that supply and demand stay balanced while regulatory and market constraints are met. They are essential for analyzing how the Spanish electricity system operates.
Although we love using pandas, it can struggle with this many rows, as this case shows. Libraries like polars can work with large datasets more efficiently.
If you want to implement best practices for your data processes or learn how to manage and analyze large datasets effectively, contact us for personalized training tailored to your needs.
