Pandas provides the function read_sas to read SAS data. It supports two formats: (1) ‘xport’ and (2) ‘sas7bdat’. Sometimes the data is very large and is provided as a compressed file. Unfortunately, it seems that pandas does not support reading compressed SAS data directly.
In this post, we introduce a not-quite-safe way to read zipped SAS data. To do so, we need three packages:
- gzip
- io
- pandas
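Reading the compressed file directly, as in the following attempt, does not work:

import gzip
import pandas as pd

f = gzip.GzipFile('your_input_file.sas7bdat.tar.gz', 'rb')
df = pd.read_sas(f, format='sas7bdat')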
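A likely reason the direct read fails is that a .tar.gz file is a gzip-compressed tar archive: decompressing it yields a tar stream that begins with a header block, not with the SAS data itself. The sketch below illustrates this with a small archive built in memory. The payload bytes and the member name data.sas7bdat are made up for illustration, and the archive is written in the ustar format so that the header is a single 512-byte block.

```python
import gzip
import io
import tarfile

# Hypothetical stand-in for the bytes of a real .sas7bdat file.
payload = b"pretend these bytes are a sas7bdat file"

# Build a small .tar.gz archive in memory with a single member.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w:gz",
                  format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo(name="data.sas7bdat")  # hypothetical member name
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# gzip only undoes the compression; the result is still a tar archive.
raw = gzip.decompress(archive.getvalue())
starts_with_payload = raw.startswith(payload)  # False: a tar header comes first
header_size = raw.index(payload)               # position where the data begins

# tarfile reports the same data offset, so no guessing is needed.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r:gz") as tar:
    member = tar.getmembers()[0]
    data_offset = member.offset_data
    extracted = tar.extractfile(member).read()

print(header_size, data_offset, extracted == payload)
```

For a real archive, the bytes returned by extractfile could be wrapped in io.BytesIO and handed to pd.read_sas with format='sas7bdat'. That step is not exercised here and assumes the member fits in memory.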
So how do we address this problem? There is a very straightforward approach, although it is not very safe (this point is explained later). The idea is simply to skip the meta-information header that precedes the SAS data in the gz file.
The code for this workaround is given below. In part 1, we override the seek method of gzip.GzipFile. The override accounts for the skipped header: whenever pd.read_sas calls seek, the header offset is added to the requested offset. In part 2, we simply guess where the header ends by trying one offset after another.
As mentioned previously, this method is not safe because the overridden seek does not take the whence parameter into account. Fortunately, the code still works (at least for some small files).
import gzip
import io

import pandas as pd


# part 1: change the "seek" behavior of the file object.
class FileObjGZ(gzip.GzipFile):
    def set_gz_offset(self, val):
        # number of leading bytes (the header) to skip
        self._gz_offset = val

    def seek(self, offset, whence=io.SEEK_SET):
        # shift every requested position past the header; whence is ignored
        new_offset = offset + self._gz_offset
        return super(FileObjGZ, self).seek(new_offset)


input_file = 'your_input_file.sas7bdat.tar.gz'

# part 2: try to skip the meta header
with FileObjGZ(input_file, 'rb') as f:
    guess_gz_offset = 0
    while True:
        try:
            f.set_gz_offset(guess_gz_offset)
            f.seek(0)
            df = pd.read_sas(f, format='sas7bdat')
            break
        except ValueError as e:
            # the guess was wrong: report the error and try the next offset
            print(e)
            guess_gz_offset += 1

print("loading data is completed.")
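The whence limitation could be addressed along the following lines. This is a minimal sketch under stated assumptions, not the method used above: the class name OffsetGzipFile and its skip argument are hypothetical, and the 512-byte header in the demo is a made-up stand-in built in memory. Absolute seeks are shifted past the header, relative seeks are passed through unchanged, and reported positions are expressed relative to the end of the header.

```python
import gzip
import io


class OffsetGzipFile(gzip.GzipFile):
    """GzipFile that hides the first `skip` decompressed bytes (hypothetical)."""

    def __init__(self, *args, skip=0, **kwargs):
        super().__init__(*args, **kwargs)
        self._skip = skip
        super().seek(skip)  # start reading just past the header

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            # absolute positions are shifted past the header
            pos = super().seek(offset + self._skip, io.SEEK_SET)
        else:
            # SEEK_CUR and SEEK_END are relative, so no shift is needed
            pos = super().seek(offset, whence)
        return pos - self._skip

    def tell(self):
        # report the position relative to the end of the header
        return self.seek(0, io.SEEK_CUR)


# Demo with made-up data: a 512-byte fake header followed by a short payload.
demo_payload = b"0123456789"
compressed = gzip.compress(b"H" * 512 + demo_payload)

with OffsetGzipFile(fileobj=io.BytesIO(compressed), mode="rb", skip=512) as g:
    first = g.read(3)                   # first three payload bytes
    pos_after_read = g.tell()
    pos_relative = g.seek(2, io.SEEK_CUR)
    g.seek(0)
    everything = g.read()               # the whole payload, header excluded
    pos_from_end = g.seek(-2, io.SEEK_END)
    tail = g.read()
```

With such a class, the offset-guessing loop would stay the same, with skip taking the place of set_gz_offset. Seeking from the end forces the whole stream to be decompressed once, which may be slow for large files.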
---END--