Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Enhance data viewer #619

Closed
wants to merge 4 commits into from
Closed

WIP: Enhance data viewer #619

wants to merge 4 commits into from

Conversation

danielbasso
Copy link
Contributor

@danielbasso danielbasso commented Apr 11, 2021

What problem did you solve?

This is a proof of concept for using the jupyter notebook api for data viewing, as discussed in #506. Be aware it makes jupyter extension a requisite for vscode-R.

How can I check this pull request?

Try this code:


normal_tab_size  <- 100000

my_df_normal  <-  data.frame(
                  double1 = rnorm(normal_tab_size), 
                  double2 = rnorm(normal_tab_size), 
                  integer1 = 1L:normal_tab_size,
                  logical1 = ifelse(1L:normal_tab_size%%2 == 0, TRUE, FALSE),
                  string1 = ifelse(1L:normal_tab_size%%2 == 0, "lalala", "lololo"))                  

huge_tab_size <- 2000000

my_df_huge  <-  data.frame(
                  double1 = rnorm(huge_tab_size), 
                  double2 = rnorm(huge_tab_size), 
                  integer1 = 1L:huge_tab_size,
                  logical1 = ifelse(1L:huge_tab_size%%2 == 0, TRUE, FALSE),
                  string1 = ifelse(1L:huge_tab_size%%2 == 0, "lalala", "lololo"))  

View(my_df_normal) # 9mb ~ 1 sec
View(my_df_huge) # 173mb ~ 25 sec - Very responsive  after loading

my_matrix <- matrix(c(1,2,3,4,5,6,7,8,9), 3, 3)

View(my_matrix)

Screenshot

image

It loads the data by pieces, but attempts to load all data (couldn't figure out how to stop this behavior). So, during loading, various call to R are made to create the appropriate files:

image

Most data frames will load pretty fast, only when there is a lot of data it can take longer. The cool thing is that the viewer displays a "loading status" cue and after it finishes, it's usually smooth to navigate. Rarely, for big data frames, it hangs, just close and try again.

Files are created accordingly:

image

Closing one viewer automatically deletes its associated files.

Filters don't do much: you can filter for partial matches on strings and numbers; logical columns I couldn't figure out how the filter works; numeric columns will filter for 0 when you pass illogical arguments.

@danielbasso
Copy link
Contributor Author

Hi,

I did edit the PR text as reading again the last phrase sounded kind rude heh. And I also think I was not clear enough about what I was trying to achieve with it. So, to clarify my intentions:

As this implementation is not a new feature, but rather a replacement for something that already exists, I'm trying to be very careful on how to approach it. My goal is to receive feedback, so feel free to criticize and suggest improvements. Maybe @renkun-ken can express his opinion, as he created the data viewer.

My personal critic is that I'm not 100% satisfied with the typescript calls to R, through rTerminal.runCommand, being seen by the user. I couldn't figure out a way to make the call "invisible", using other session, terminal, etc. I could use some help in this regard.

Also, I'm assuming that squashing commits and resolving conflicts of this PR is up the project owner, as I fell some commits behind master after I started. Please let me know if this should be my responsibility. If latter is the case, sorry for any inconvenience.

@danielbasso danielbasso deleted the enhance-data-viwer-proposal branch July 15, 2021 20:37
@acarril
Copy link

acarril commented Jul 15, 2021

@danielbasso can I ask what happened to this? I was looking forward to it.

Edit: I see this functionality might be superseded by #708 ? However, I understand that that implementation of ag-grid doesn't not do dynamic loading, which IMO was the killer feature of your implementation.
Edit 2: However, it seems ag-grid per se is capable of dynamic loading under some "row models" (see here), although this only seems available in the "enterprise" version.

@renkun-ken
Copy link
Member

I don't implement a dynamic loading mechanism similar to this in #708 (which might be supported by ag-grid-community infinite row model) since the filtering and sorting might be confusing as it does not load all data and the way of sending command to terminal to load data always seems problematic if the terminal is busy or user already typed some text.

For future reference, I might prefer a mechanism of background connection (e.g. enabled by non-blocking websocket and httpuv), or the interactive window talks with a fully managed R session (suggested in #700) so that it is much easier and safer to let the session do things as supposed.

@acarril
Copy link

acarril commented Jul 15, 2021

@renkun-ken I see. Thank you for the explanation, and thank you for your work on this feature!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants