classify_text: NotImplementedError: Cannot copy out of meta tensor; no data! #5707

@universalmind303

Describe the bug

classify_text raises "NotImplementedError: Cannot copy out of meta tensor; no data!" when called with its default model.

To Reproduce

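import daft

# Load the Yelp review dataset from the Hugging Face Hub.
df = daft.read_huggingface("yelp/yelp_review_full")
# Label each review using classify_text with its default model.
df = df.with_column("label", daft.functions.classify_text(df["text"], labels=["positive", "negative"]))
# write_parquet executes the plan; per the traceback below, the error is raised here.
df = df.write_parquet("~/Data/Yelp/yelp_full_review/")

df.collect()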
F __call__-f46d2f68-5794-4ad6-b7cb-a6954edbc270
Traceback (most recent call last):
  File "/Users/corygrinstead/Development/Daft/data/scratch.py", line 14, in <module>
    df = df.write_parquet("~/Data/Yelp/yelp_full_review_embedded/")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/dataframe/dataframe.py", line 814, in write_parquet
    write_df.collect()
  File "/Users/corygrinstead/Development/Daft/daft/dataframe/dataframe.py", line 4051, in collect
    self._materialize_results()
  File "/Users/corygrinstead/Development/Daft/daft/dataframe/dataframe.py", line 4014, in _materialize_results
    self._result_cache = get_or_create_runner().run(self._builder)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/runners/native_runner.py", line 77, in run
    results = list(self.run_iter(builder))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/runners/native_runner.py", line 134, in run_iter
    raise e
  File "/Users/corygrinstead/Development/Daft/daft/runners/native_runner.py", line 118, in run_iter
    for result in results_gen:
  File "/Users/corygrinstead/Development/Daft/daft/execution/native_executor.py", line 70, in run
    event_loop.run(async_exec.aclose())
  File "/Users/corygrinstead/Development/Daft/daft/event_loop.py", line 28, in run
    return asyncio.run_coroutine_threadsafe(future, self.loop).result()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/.local/share/uv/python/cpython-3.11.13-macos-aarch64-none/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/.local/share/uv/python/cpython-3.11.13-macos-aarch64-none/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/Users/corygrinstead/Development/Daft/daft/execution/native_executor.py", line 59, in stream_results
    _ = await result_handle.finish()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/udf/execution.py", line 83, in call_batch_func
    bound_method = cls._daft_bind_method(method)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/udf/udf_v2.py", line 319, in _daft_bind_method
    local_instance = self._daft_get_instance()
                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/udf/udf_v2.py", line 377, in _daft_get_instance
    self._daft_local_instance = cls(*args, **kwargs)
                                ^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/ai/_expressions.py", line 70, in __init__
    self.text_classifier = text_classifier.instantiate()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/ai/transformers/protocols/text_classifier.py", line 51, in instantiate
    return TransformersTextClassifier(self.model_name, **self.model_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/daft/ai/transformers/protocols/text_classifier.py", line 69, in __init__
    self._pipeline = pipeline(
                     ^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/transformers/pipelines/__init__.py", line 1210, in pipeline
    return pipeline_class(model=model, framework=framework, task=task, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/transformers/pipelines/zero_shot_classification.py", line 92, in __init__
    super().__init__(*args, **kwargs)
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1043, in __init__
    self.model.to(self.device)
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4346, in to
    return super().to(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1369, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 928, in _apply
    module._apply(fn)
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 928, in _apply
    module._apply(fn)
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 955, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/Users/corygrinstead/Development/Daft/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1362, in convert
    raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
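
The last frame of the traceback can be reproduced in isolation. The following is a minimal sketch, assuming only that PyTorch is installed. It does not use Daft or transformers, and the helper names `move_meta_module` and `materialize_meta_module` are made up for illustration. It shows that a module whose parameters live on the meta device has shapes but no data, so `.to()` fails with the same message, while `.to_empty()` allocates uninitialized storage instead.

```python
import torch


def move_meta_module() -> str:
    """Try to move a meta-device module to CPU and return the error text."""
    # Parameters created on the "meta" device have shapes and dtypes but no storage.
    layer = torch.nn.Linear(4, 2, device="meta")
    assert layer.weight.is_meta

    try:
        layer.to("cpu")  # there is no data to copy, so this raises
    except NotImplementedError as exc:
        return str(exc)
    return ""


def materialize_meta_module() -> torch.nn.Module:
    """Allocate real CPU storage for a meta-device module."""
    layer = torch.nn.Linear(4, 2, device="meta")
    # to_empty() allocates memory without copying; the values are
    # uninitialized until real weights are loaded into the module.
    return layer.to_empty(device="cpu")


if __name__ == "__main__":
    print(move_meta_module())
    print(materialize_meta_module().weight.device)
```

Note that `to_empty()` only avoids the exception. It does not restore pretrained weights, so on its own it is not a fix for the failure reported here.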

Expected behavior

classify_text should load its default model and label each row as "positive" or "negative" without raising.

Component(s)

Built-in Functions

Additional context

Note: the failure is intermittent. It does not occur on every run, but it occurs more often than not.
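
An intermittent failure like this would be consistent with several workers constructing the model at the same time, although the issue does not confirm that cause. As a diagnostic sketch under that assumption, model construction can be serialized behind a lock and the result cached. `SerializedLoader` and the `factory` argument are hypothetical names; `factory` stands in for whatever call builds the classifier and is not part of Daft's or transformers' API.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional


class SerializedLoader:
    """Build an expensive object at most once, even under concurrent access."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory          # hypothetical stand-in for the model constructor
        self._lock = threading.Lock()
        self._instance: Optional[Any] = None

    def get(self) -> Any:
        # Only one thread at a time may run the factory; later callers
        # wait on the lock and then reuse the cached instance.
        with self._lock:
            if self._instance is None:
                self._instance = self._factory()
            return self._instance


if __name__ == "__main__":
    build_count = []

    def slow_factory() -> object:
        build_count.append(1)
        time.sleep(0.05)  # simulate a slow model load
        return object()

    loader = SerializedLoader(slow_factory)
    with ThreadPoolExecutor(max_workers=8) as pool:
        models = list(pool.map(lambda _: loader.get(), range(16)))

    print("factory calls:", len(build_count))
    print("all share one instance:", all(m is models[0] for m in models))
```

If the error disappears once construction is serialized this way, that points toward a concurrency problem during model loading. If it persists, the cause lies elsewhere.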

Labels

bug (Something isn't working), p0 (Priority 0 - to be addressed immediately)
