Open
Labels
bug (Something isn't working), p0 (Priority 0 - to be addressed immediately)
Description
Describe the bug
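Calling daft.functions.classify_text with the default model raises NotImplementedError (Cannot copy out of meta tensor; no data!) while the model is being loaded.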
To Reproduce
import daft
df = daft.read_huggingface("yelp/yelp_review_full")
df = df.with_column("label", daft.functions.classify_text(df["text"], labels=["positive", "negative"]))
df = df.write_parquet("~/Data/Yelp/yelp_full_review/")
df.collect()
F __call__-f46d2f68-5794-4ad6-b7cb-a6954edbc270
Traceback (most recent call last):
File "/Users/corygrinstead/Development/Daft/data/scratch.py", line 14, in <module>
df = df.write_parquet("~/Data/Yelp/yelp_full_review_embedded/")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/dataframe/dataframe.py", line 814, in write_parquet
write_df.collect()
File "/Users/corygrinstead/Development/Daft/daft/dataframe/dataframe.py", line 4051, in collect
self._materialize_results()
File "/Users/corygrinstead/Development/Daft/daft/dataframe/dataframe.py", line 4014, in _materialize_results
self._result_cache = get_or_create_runner().run(self._builder)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/runners/native_runner.py", line 77, in run
results = list(self.run_iter(builder))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/runners/native_runner.py", line 134, in run_iter
raise e
File "/Users/corygrinstead/Development/Daft/daft/runners/native_runner.py", line 118, in run_iter
for result in results_gen:
File "/Users/corygrinstead/Development/Daft/daft/execution/native_executor.py", line 70, in run
event_loop.run(async_exec.aclose())
File "/Users/corygrinstead/Development/Daft/daft/event_loop.py", line 28, in run
return asyncio.run_coroutine_threadsafe(future, self.loop).result()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/.local/share/uv/python/cpython-3.11.13-macos-aarch64-none/lib/python3.11/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/.local/share/uv/python/cpython-3.11.13-macos-aarch64-none/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/Users/corygrinstead/Development/Daft/daft/execution/native_executor.py", line 59, in stream_results
_ = await result_handle.finish()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/udf/execution.py", line 83, in call_batch_func
bound_method = cls._daft_bind_method(method)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/udf/udf_v2.py", line 319, in _daft_bind_method
local_instance = self._daft_get_instance()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/udf/udf_v2.py", line 377, in _daft_get_instance
self._daft_local_instance = cls(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/ai/_expressions.py", line 70, in __init__
self.text_classifier = text_classifier.instantiate()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/ai/transformers/protocols/text_classifier.py", line 51, in instantiate
return TransformersTextClassifier(self.model_name, **self.model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/daft/ai/transformers/protocols/text_classifier.py", line 69, in __init__
self._pipeline = pipeline(
^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/transformers/pipelines/__init__.py", line 1210, in pipeline
return pipeline_class(model=model, framework=framework, task=task, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/transformers/pipelines/zero_shot_classification.py", line 92, in __init__
super().__init__(*args, **kwargs)
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1043, in __init__
self.model.to(self.device)
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4346, in to
return super().to(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1369, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 928, in _apply
module._apply(fn)
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 928, in _apply
module._apply(fn)
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 955, in _apply
param_applied = fn(param)
^^^^^^^^^
File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1362, in convert
raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
Expected behavior
The text is classified as expected, with no error raised.
Component(s)
Built-in Functions
Additional context
Note: the failure is intermittent. It does not happen on every run, but it happens more often than not.
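One possible explanation for the intermittency, which this report does not confirm, is that several workers construct the model at the same time and one of them sees a partially initialized module. The sketch below shows a generic way to serialize a one-time construction behind a lock. The names LockedLazy and fake_load_model are hypothetical and purely illustrative; this is not Daft's or Transformers' actual code, and it uses only the Python standard library.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LockedLazy(Generic[T]):
    """Builds a value at most once, even if many threads request it concurrently.

    Hypothetical helper for illustration only; not part of Daft.
    """

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._value: Optional[T] = None
        self._ready = False

    def get(self) -> T:
        # Hold the lock for the whole construction so that no two threads
        # ever run the factory at the same time.
        with self._lock:
            if not self._ready:
                self._value = self._factory()
                self._ready = True
            return self._value  # type: ignore[return-value]


def fake_load_model() -> object:
    # Stand-in for an expensive model load (assumption: the real loader
    # would be whatever builds the classifier pipeline).
    return object()


shared_model = LockedLazy(fake_load_model)


def worker(results: list) -> None:
    # Every worker asks for the model; only the first call builds it.
    results.append(shared_model.get())


if __name__ == "__main__":
    results: list = []
    threads = [threading.Thread(target=worker, args=(results,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All workers received the same instance.
    print(all(r is results[0] for r in results))
```

If the failure disappears when model construction is serialized like this, that would point toward a concurrency problem in loading; if it persists, the cause lies elsewhere.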