This repository was archived by the owner on Jul 22, 2024. It is now read-only.
Serialization fails for some users #17
Open
Labels: bug (Something isn't working)
Description
Some users see the following error when the ETL job tries to serialize its results to Parquet:
(base) ➜ orbit_prediction git:(master) python3 orbit_prediction/spacetrack_etl.py --st_user justin.smithcastro@gmail.com --st_password <mypassword> --norad_id_file sample_data/test_norad_ids.txt --past_n_days 10 --output_path outputfile
INFO:__main__:Fetching Satellite Catalog Data...
INFO:__main__:Number of TLE Batch Requests: 1
INFO:__main__:Starting to fetch TLEs from space-track.org
INFO:__main__:Processing batch 1/1
INFO:__main__:Fetching TLEs for 20 ASOs...
INFO:__main__:Parsing raw TLE data...
INFO:__main__:Finished fetching TLEs
INFO:__main__:Calculating orbital state vectors for 372 TLEs...
INFO:__main__:Serializing data...
Traceback (most recent call last):
  File "orbit_prediction/spacetrack_etl.py", line 309, in <module>
    orbit_data_df.to_parquet(args.output_path)
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pandas/util/_decorators.py", line 214, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 2109, in to_parquet
    to_parquet(
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pandas/io/parquet.py", line 260, in to_parquet
    return impl.write(
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pandas/io/parquet.py", line 112, in write
    self.api.parquet.write_table(
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 1733, in write_table
    writer.write_table(table, row_group_size=row_group_size)
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 591, in write_table
    self.writer.write_table(table, row_group_size=row_group_size)
  File "pyarrow/_parquet.pyx", line 1433, in pyarrow._parquet.ParquetWriter.write_table
  File "pyarrow/error.pxi", line 84, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Casting from timestamp[ns] to timestamp[ms] would lose data: 1602624833909568000
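This looks like an issue with pyarrow: the write casts nanosecond timestamps to milliseconds, and pyarrow refuses because the value shown has sub-millisecond precision that would be dropped. This Stack Overflow thread may provide a solution.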
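For reference, here is a minimal sketch of two possible workarounds. I have not run either against the actual ETL job, and I am not claiming this is what the linked thread suggests. The column name `epoch`, the one-row stand-in DataFrame, and the helper `write_parquet_truncating` are made up for illustration. Only the timestamp value comes from the error message. The second option assumes the pyarrow engine, since `coerce_timestamps` and `allow_truncated_timestamps` are pyarrow `write_table` options that pandas forwards as keyword arguments.

```python
import pandas as pd

# Nanoseconds since the Unix epoch, copied from the ArrowInvalid message.
ts = pd.Timestamp(1602624833909568000)

# Hypothetical stand-in for orbit_data_df; "epoch" is an assumed column name.
df = pd.DataFrame({"epoch": [ts]})

# Option 1: drop the sub-millisecond part in pandas before writing,
# so the later cast to milliseconds has nothing left to lose.
df_ms = df.assign(epoch=df["epoch"].dt.floor("ms"))


# Option 2: tell pyarrow explicitly that truncating on write is acceptable.
def write_parquet_truncating(frame, path):
    frame.to_parquet(
        path,
        engine="pyarrow",
        coerce_timestamps="ms",            # target resolution in the file
        allow_truncated_timestamps=True,   # do not raise when precision is lost
    )
```

Both options discard the sub-millisecond part of each timestamp (568000 ns for the value above), so they only make sense if the downstream code does not need that precision.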