WIP: Enhance data viewer #619

danielbasso · 2021-04-11T06:11:50Z

What problem did you solve?

This is a proof of concept for using the jupyter notebook api for data viewing, as discussed in #506. Be aware it makes jupyter extension a requisite for vscode-R.

How can I check this pull request?

Try this code:


normal_tab_size  <- 100000

my_df_normal  <-  data.frame(
                  double1 = rnorm(normal_tab_size), 
                  double2 = rnorm(normal_tab_size), 
                  integer1 = 1L:normal_tab_size,
                  logical1 = ifelse(1L:normal_tab_size%%2 == 0, TRUE, FALSE),
                  string1 = ifelse(1L:normal_tab_size%%2 == 0, "lalala", "lololo"))                  

huge_tab_size <- 2000000

my_df_huge  <-  data.frame(
                  double1 = rnorm(huge_tab_size), 
                  double2 = rnorm(huge_tab_size), 
                  integer1 = 1L:huge_tab_size,
                  logical1 = ifelse(1L:huge_tab_size%%2 == 0, TRUE, FALSE),
                  string1 = ifelse(1L:huge_tab_size%%2 == 0, "lalala", "lololo"))  

View(my_df_normal) # 9mb ~ 1 sec
View(my_df_huge) # 173mb ~ 25 sec - Very responsive  after loading

my_matrix <- matrix(c(1,2,3,4,5,6,7,8,9), 3, 3)

View(my_matrix)

Screenshot

It loads the data by pieces, but attempts to load all data (couldn't figure out how to stop this behavior). So, during loading, various call to R are made to create the appropriate files:

Most data frames will load pretty fast, only when there is a lot of data it can take longer. The cool thing is that the viewer displays a "loading status" cue and after it finishes, it's usually smooth to navigate. Rarely, for big data frames, it hangs, just close and try again.

Files are created accordingly:

Closing one viewer automatically deletes its associated files.

Filters don't do much: you can filter for partial matches on strings and numbers; logical columns I couldn't figure out how the filter works; numeric columns will filter for 0 when you pass illogical arguments.

danielbasso · 2021-04-12T14:28:43Z

Hi,

I did edit the PR text as reading again the last phrase sounded kind rude heh. And I also think I was not clear enough about what I was trying to achieve with it. So, to clarify my intentions:

As this implementation is not a new feature, but rather a replacement for something that already exists, I'm trying to be very careful on how to approach it. My goal is to receive feedback, so feel free to criticize and suggest improvements. Maybe @renkun-ken can express his opinion, as he created the data viewer.

My personal critic is that I'm not 100% satisfied with the typescript calls to R, through rTerminal.runCommand, being seen by the user. I couldn't figure out a way to make the call "invisible", using other session, terminal, etc. I could use some help in this regard.

Also, I'm assuming that squashing commits and resolving conflicts of this PR is up the project owner, as I fell some commits behind master after I started. Please let me know if this should be my responsibility. If latter is the case, sorry for any inconvenience.

acarril · 2021-07-15T22:01:25Z

@danielbasso can I ask what happened to this? I was looking forward to it.

Edit: I see this functionality might be superseded by #708 ? However, I understand that that implementation of ag-grid doesn't not do dynamic loading, which IMO was the killer feature of your implementation.
Edit 2: However, it seems ag-grid per se is capable of dynamic loading under some "row models" (see here), although this only seems available in the "enterprise" version.

renkun-ken · 2021-07-15T22:21:32Z

I don't implement a dynamic loading mechanism similar to this in #708 (which might be supported by ag-grid-community infinite row model) since the filtering and sorting might be confusing as it does not load all data and the way of sending command to terminal to load data always seems problematic if the terminal is busy or user already typed some text.

For future reference, I might prefer a mechanism of background connection (e.g. enabled by non-blocking websocket and httpuv), or the interactive window talks with a fully managed R session (suggested in #700) so that it is much easier and safer to let the session do things as supposed.

acarril · 2021-07-15T22:24:33Z

@renkun-ken I see. Thank you for the explanation, and thank you for your work on this feature!

danielbasso added 4 commits April 8, 2021 16:25

first working version

b848e27

Better loading, dispose delete files, and more

2f9e9aa

Slight refactor

ee36f74

remove launch,json garbage

a8efb21

danielbasso closed this Jul 15, 2021

danielbasso deleted the enhance-data-viwer-proposal branch July 15, 2021 20:37

eitsupi mentioned this pull request Jan 19, 2022

Add row limit setting of data viewer and support Apache Arrow Table #945

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Enhance data viewer #619

WIP: Enhance data viewer #619

danielbasso commented Apr 11, 2021 •

edited

Loading

danielbasso commented Apr 12, 2021

acarril commented Jul 15, 2021 •

edited

Loading

renkun-ken commented Jul 15, 2021

acarril commented Jul 15, 2021

WIP: Enhance data viewer #619

WIP: Enhance data viewer #619

Conversation

danielbasso commented Apr 11, 2021 • edited Loading

danielbasso commented Apr 12, 2021

acarril commented Jul 15, 2021 • edited Loading

renkun-ken commented Jul 15, 2021

acarril commented Jul 15, 2021

danielbasso commented Apr 11, 2021 •

edited

Loading

acarril commented Jul 15, 2021 •

edited

Loading