You haven't seen anything until you need to put a 4.2 GB gzipped CSV into a pandas dataframe, which works without any issues, I should note.
I raise you thousands of gzipped files (total > 20 GB) combined into one dataframe. Frankly, my work laptop did not like it all that much. But most basic operations still worked fine, though.
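For anyone wondering what "thousands of gzipped files into one dataframe" might look like, here is a rough sketch, not necessarily what the poster above ran. The helper name `load_gzipped_csvs`, the `part_*.csv.gz` file pattern, and the demo columns are all made up for illustration; it assumes every file shares the same columns.

```python
import glob
import gzip
import os
import tempfile

import pandas as pd


def load_gzipped_csvs(pattern):
    """Read every gzipped CSV matching `pattern` and stack them into one DataFrame.

    Hypothetical helper; assumes all files have the same columns.
    """
    paths = sorted(glob.glob(pattern))
    # pandas can infer gzip from the .gz extension; it is passed explicitly here.
    frames = (pd.read_csv(path, compression="gzip") for path in paths)
    # ignore_index gives the combined frame a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)


# Tiny demo with three throwaway files (made-up data).
tmpdir = tempfile.mkdtemp()
for i in range(3):
    with gzip.open(os.path.join(tmpdir, f"part_{i}.csv.gz"), "wt") as fh:
        fh.write("id,value\n")
        fh.write(f"{i},{i * 10}\n")

combined = load_gzipped_csvs(os.path.join(tmpdir, "part_*.csv.gz"))
print(combined)
```

Note that the whole result still has to fit in memory at once, which is probably why the laptop complained.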
I really don’t think that’s a lot either. Nowadays we routinely process terabytes of data.
Yeah, it was just a simple example. Although using just pandas (without something like dask) for loading terabytes of data at once into a single dataframe may not be the best idea, even with enough memory.
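If dask is not an option, one way to avoid loading everything at once with plain pandas is the `chunksize` argument of `read_csv`. A minimal sketch, assuming you only need an aggregate rather than the full dataframe; the function name `sum_column_in_chunks`, the file name, and the `value` column are hypothetical.

```python
import gzip
import os
import tempfile

import pandas as pd


def sum_column_in_chunks(path, column, chunksize=100_000):
    """Sum one column of a gzipped CSV without holding the whole file in memory.

    Hypothetical helper; only `chunksize` rows are loaded at a time.
    """
    total = 0
    # With chunksize set, read_csv yields DataFrames instead of returning one.
    for chunk in pd.read_csv(path, compression="gzip", chunksize=chunksize):
        total += chunk[column].sum()
    return total


# Tiny demo file with the values 0..9 (made-up data).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv.gz")
with gzip.open(demo_path, "wt") as fh:
    fh.write("value\n")
    for i in range(10):
        fh.write(f"{i}\n")

chunked_total = sum_column_in_chunks(demo_path, "value", chunksize=3)
print(chunked_total)
```

This only helps for operations that can be computed piece by piece (sums, counts, filters); anything that needs all rows together is where something like dask starts to make sense.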
It’s good to see the occult is still alive and well