~savoy/holo#1: 
Queries passed to Data.from_sql fail unless every column listed in a schema's `id_hash` is present

A more limited query, e.g. one that pulls only phone numbers from the People table and leaves out name and email, raises a nested exception through polars:

SELECT peo.id, peo.phoneNumber AS peopleNumber, cam.phoneNumber AS campaignNumber FROM People peo LEFT JOIN Campaigns cam ON peo.campaignContactId = cam.id WHERE cam.name=$CAMPAIGN_NAME
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
  File "/home/savoy/.cache/pypoetry/virtualenvs/holo-WR-n4JgG-py3.9/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3505, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-119-5af4ca4d567c>", line 1, in <module>
    data = holo_sms.send(cfg, "hi", campaigns=[$CAMPAIGN_NAME])
  File "/home/savoy/dev/psl/holonet/holo/src/holo/holo_sms.py", line 208, in send
    data = Data.from_sql(cfg, queries)
  File "/home/savoy/dev/psl/holonet/holo/src/holo/holo_data.py", line 356, in from_sql
  File "/home/savoy/dev/psl/holonet/holo/src/holo/holo_data.py", line 215, in __init__
    self.frames = self._convert_tables_to_frames(data, type_match)
  File "/home/savoy/dev/psl/holonet/holo/src/holo/holo_data.py", line 426, in _convert_tables_to_frames
    frames.update(
  File "/home/savoy/dev/psl/holonet/holo/src/holo/holo_data.py", line 553, in _convert_sql_results_to_frames
    else:
  File "/home/savoy/dev/psl/holonet/holo/src/holo/holo_data.py", line 565, in _create_hashed_id
    if table.id_hash:
  File "/home/savoy/.cache/pypoetry/virtualenvs/holo-WR-n4JgG-py3.9/lib/python3.9/site-packages/polars/dataframe/frame.py", line 6798, in with_columns
    self.lazy()
  File "/home/savoy/.cache/pypoetry/virtualenvs/holo-WR-n4JgG-py3.9/lib/python3.9/site-packages/polars/lazyframe/frame.py", line 1475, in collect
    return wrap_df(ldf.collect())
exceptions.ColumnNotFoundError: firstName

Error originated just after this operation:
DF ["id", "peopleNumber", "campaignNumber"]; PROJECT */3 COLUMNS; SELECTION: "None"

It seems that creating Data, which checks for a unique ID, fails whenever the `id_hash` columns needed to build that ID are missing from the query results, even if the `id` column itself is already included in the query.

If the row id is present, the id_hash columns should not be required in the query.
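
The requested behaviour can be sketched roughly as follows. This is a minimal, library-agnostic illustration and not the actual holo code: `ensure_row_id`, its parameters, the use of plain dicts for rows, and the SHA-256 digest are all assumptions made for the example. The real `_create_hashed_id` works on polars frames and may hash differently.

```python
import hashlib
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[str]]


def ensure_row_id(
    rows: List[Row],
    columns: Sequence[str],
    id_hash: Sequence[str],
    id_column: str = "id",
) -> List[Row]:
    """Hypothetical helper: only derive a hashed id when one is actually needed."""
    # The query already returned a row id, so the id_hash columns are not required.
    if id_column in columns:
        return rows

    # The schema declares no id_hash columns, so there is nothing to derive.
    if not id_hash:
        return rows

    # Fail early with a readable message instead of a nested ColumnNotFoundError.
    missing = [name for name in id_hash if name not in columns]
    if missing:
        raise ValueError(
            f"query returned neither '{id_column}' nor the id_hash columns {missing}"
        )

    # Derive the id from the id_hash columns (SHA-256 is an assumption here).
    hashed: List[Row] = []
    for row in rows:
        key = "\x1f".join(str(row[name]) for name in id_hash)
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        hashed.append({**row, id_column: digest})
    return hashed
```

With the columns from the failing query above (`id`, `peopleNumber`, `campaignNumber`), the first branch returns the rows untouched, so `firstName` is never looked up.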

Status
REPORTED
Submitter
~savoy
Assigned to
Submitted
1 year, 6 months ago
Updated
1 year, 6 months ago
Labels
bug