feat: collect once during display() in jupyter notebooks by timsaucer · Pull Request #1167 · apache/datafusion-python

timsaucer · 2025-06-23T14:38:06Z

Which issue does this PR close?

None

Rationale for this change

By design, display() in a Jupyter notebook calls both __repr__ and _repr_html_. This currently causes collect() on a DataFrame to run twice, which can double the execution time during evaluation. This PR makes collect() run only once.

What changes are included in this PR?

If we are in a Jupyter notebook, we cache the result of whichever of __repr__ or _repr_html_ is called first. When the other call happens, it consumes the cached result. This means that for display() in a Jupyter notebook the collected data is freed once both calls have run.
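The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `CachedReprFrame`, `_rows_for`, and `simulate_notebook_display` are hypothetical names, and a plain list stands in for the collected record batches.

```python
class CachedReprFrame:
    """Hypothetical stand-in for a DataFrame whose collect() is expensive."""

    def __init__(self, rows):
        self._rows = rows
        self.collect_count = 0  # how many times collect() actually ran
        self._cache = None      # (producer_name, rows) or None

    def collect(self):
        self.collect_count += 1
        return list(self._rows)

    def _rows_for(self, caller):
        # Consume a result cached by the *other* repr method, if there is one.
        if self._cache is not None and self._cache[0] != caller:
            rows = self._cache[1]
            self._cache = None  # drop the reference so the data can be freed
            return rows
        # Otherwise collect now and leave the result for the other method.
        rows = self.collect()
        self._cache = (caller, rows)
        return rows

    def __repr__(self):
        return "\n".join(str(r) for r in self._rows_for("repr"))

    def _repr_html_(self):
        cells = "".join(f"<tr><td>{r}</td></tr>" for r in self._rows_for("html"))
        return f"<table>{cells}</table>"


def simulate_notebook_display(obj):
    """Mimic a frontend that requests both the plain-text and HTML forms."""
    return repr(obj), obj._repr_html_()


df = CachedReprFrame([1, 2, 3])
text, html = simulate_notebook_display(df)
print(df.collect_count)
```

In this sketch a frontend that only ever calls __repr__ still gets fresh data on every call, because a result cached by the same method is never reused. The cost is that one collected result stays referenced until the next call.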

Are there any user-facing changes?

None.

alamb

Makes sense to me

kylebarron

I don't think this is a reasonable workaround because there are many Jupyter-protocol frontends that do not support displaying HTML output. This means that repr would be broken for the IPython console, for example

kylebarron · 2025-06-23T22:09:49Z

> By design in a Jupyter notebook display() calls both __repr__ and _repr_html_.

Ref https://discourse.jupyter.org/t/find-out-if-my-code-runs-inside-a-notebook-or-jupyter-lab/6935/8

You might be able to look in the IPython config to see what's running, but this answer is 10+ years old and might have changed: https://stackoverflow.com/a/24937408
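For illustration, a coarser check than inspecting the IPython config is to ask whether any IPython shell is active at all, which is roughly where the commits in this PR end up ("any kind of ipython environment"). The sketch below is an assumption-laden example, not the code merged here: `ipython_shell_name` and `in_ipython_environment` are hypothetical helper names, and the only library call used is `IPython.get_ipython()`.

```python
def ipython_shell_name():
    """Return the class name of the active IPython shell, or None outside IPython."""
    try:
        from IPython import get_ipython
    except ImportError:
        return None  # IPython is not installed: plain Python interpreter
    shell = get_ipython()  # None when no interactive shell is running
    return None if shell is None else type(shell).__name__


def in_ipython_environment():
    """True for any IPython-based frontend, without telling them apart."""
    return ipython_shell_name() is not None


print(in_ipython_environment())
```

This deliberately does not try to distinguish a notebook from a console. It only separates IPython-based sessions from a regular Python interpreter.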

timsaucer · 2025-06-24T00:38:28Z

> I don't think this is a reasonable workaround because there are many Jupyter-protocol frontends that do not support displaying HTML output. This means that repr would be broken for the IPython console, for example

Thanks for the feedback! I changed to check for the environment as you suggested and tested in jupyter, ipython console, and regular python console.

kylebarron · 2025-06-24T01:07:17Z

As mentioned in a comment on SO, that fails in jupyter console, and I verified that it still fails:

kylebarron

I think that looks a lot more stable, even though it's kinda annoying

alamb · 2025-06-25T19:26:32Z

🎉 thank you @timsaucer and @kylebarron

timsaucer mentioned this pull request Jun 23, 2025

Rerun PRs rerun-io/opensource#2

Open

timsaucer self-assigned this Jun 23, 2025

alamb approved these changes Jun 23, 2025

kylebarron suggested changes Jun 23, 2025

timsaucer marked this pull request as draft June 24, 2025 00:25

kylebarron approved these changes Jun 24, 2025

timsaucer marked this pull request as ready for review June 25, 2025 11:27

timsaucer added 4 commits June 25, 2025 08:21

Only collect one time during display() in jupyter notebooks

1c80dcd

Check for juypter notebook environment specifically

00a2bb4

Remove approach of checking environment which could not differentiate between jupyter console and notebook

14e8efa

Instead of trying to detect notebook vs console, collect one time when we have any kind if ipython environment.

ae65240

timsaucer force-pushed the feat/collect-once-in-jupyter-notebook branch from 8d65e99 to ae65240 Compare June 25, 2025 13:02

timsaucer merged commit 9545634 into apache:main Jun 25, 2025
17 checks passed

timsaucer deleted the feat/collect-once-in-jupyter-notebook branch June 25, 2025 15:29

timsaucer merged 4 commits into apache:main from timsaucer:feat/collect-once-in-jupyter-notebook
