-
Hi there, I'm trying to run the It is running inside Kubernetes with 85G free disk space and allowed up to 16G of memory. I have tried to set the Any advice would be greatly appreciated, thanks in advance.
-
If you are running with `--debug` then you should see some messages like the following:

Obviously those variables would be substituted with the real values in your environment.

You mention Kubernetes, so the one thing that comes to mind is how you are invoking your script. Are you calling it from an entrypoint script in your container image, manually from a shell, or via the `command` and/or `args` portions of your pod spec?

Whichever of those it is, I would make sure you are properly escaping the arguments because it's possible that

Note that the script honours the value of the

In general it would be useful to confirm exactly which version of Jena you are using and how you invoke the
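As a small illustration of the quoting point above, the sketch below uses a hypothetical helper, `count_args`, to show how a shell splits a multi-word value unless it is quoted. The `--sort-args` flag and the temporary directory path come from this thread; the `--buffer-size=1G` value is only an example, and nothing here invokes the real loader script.

```shell
#!/bin/sh
# count_args is a hypothetical helper, not part of Jena: it prints how many
# separate arguments the shell actually passed to it.
count_args() {
  echo "$#"
}

# Example value only: two sort options intended to travel as ONE argument.
SORT_OPTS="--buffer-size=1G --temporary-directory=/var/lib/tmp"

# Quoted: the flag plus one value, so two arguments reach the callee.
quoted=$(count_args --sort-args "$SORT_OPTS")

# Unquoted: the shell splits the value on whitespace, so three arguments
# arrive and the second sort option may be misread by the callee.
# shellcheck disable=SC2086
unquoted=$(count_args --sort-args $SORT_OPTS)

echo "quoted=$quoted unquoted=$unquoted"
```

In a pod spec, each element of `command` and `args` is handed to the process as a single argument without shell splitting, so the same value may need different quoting there than in an entrypoint script or an interactive shell.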
-
Thanks for such a prompt reply @rvesse, you're right, I should have stated the version etc. The version I am using is It's good to know it should honour the
-
I'm not sure what's in 3.4.0. There are important security upgrades and fixes.
More work has gone into the TDB2 loader. As well as an xloader-style loader, there is a multithreaded loader which is easier to use.
-
This is all great stuff, thanks, it gives me avenues to investigate. If I were to upgrade, we use both
If I were to upgrade, what would be the
-
Some further background. The commands all work fine when I use, say, 300 files, but fail when I use 1000 files. More files means more data, and the total will be closer to 2000+ files. It consistently fails during the sorting, so it can only be memory or disk space; if you disagree though, please just say.

The actual error I see is:

/opt/apache-jena-3.4.0/bin/tdbloader2index: line 306: 109143 Killed sort $SORT_ARGS -u $KEYS < "$DATA" > $WORK

The logging I'm seeing is:

12:09:41 DEBUG JVM Arguments are -Xmx1200M
12:09:41 DEBUG Jena Classpath is /opt/apache-jena-3.4.0/lib/*
12:09:41 INFO Index Building Phase
12:09:41 DEBUG Sort Arguments: --buffer-size=50% --parallel=3
12:09:41 DEBUG Sort Temp Directory: /var/lib/wims-staging/
12:09:41 DEBUG Sort Temp Directory is on disk //fac25d2a92a4740cf8686bc.file.core.windows.net/pvc-69ec4bea-4244-4249-aa61-a27910cbc6a7 which has 70% free space (113054253056 bytes)
12:09:41 INFO Creating Index SPO
12:09:41 DEBUG Size of data to be sorted is 8902588356 bytes
12:09:41 DEBUG Sufficient free space on database drive //fac25d2a92a4740cf8686bc.file.core.windows.net/pvc-ade1ddd5-a53f-421b-92b2-cf81a2144b94 to attempt sorting data file /fuseki/databases/green/DS-DB//data-triples.tmp (8902588356 bytes required from 201771515904 bytes free)
12:09:41 DEBUG Should be sufficient free memory (14374588416 bytes) for sort to be fully in-memory
12:09:41 INFO Sort SPO
12:09:41 DEBUG Sorting /fuseki/databases/green/DS-DB//data-triples.tmp into work file /fuseki/databases/green/DS-DB//SPO-txt

Suggesting it can do it in memory. Since the error is in the
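One way to sanity-check the "sufficient free memory" message inside a container is to compare the size of the data with the container's memory limit rather than with whatever free memory is reported. The sketch below does that comparison using the byte counts from the log above and the 16G limit from the question. It assumes a Linux cgroup v2 or v1 layout (`/sys/fs/cgroup/memory.max` or `/sys/fs/cgroup/memory/memory.limit_in_bytes`). The helper names are hypothetical, and the "half the limit" threshold is just a conservative rule of thumb chosen for this example, not something the loader script does.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the Jena scripts.

# Print the container memory limit in bytes, or "unknown" if no cgroup limit
# file is readable or the value is not a plain number ("max" on cgroup v2).
cgroup_limit_bytes() {
  for f in /sys/fs/cgroup/memory.max \
           /sys/fs/cgroup/memory/memory.limit_in_bytes; do
    if [ -r "$f" ]; then
      v=$(cat "$f")
      case "$v" in
        ''|*[!0-9]*) echo "unknown" ;;
        *) echo "$v" ;;
      esac
      return 0
    fi
  done
  echo "unknown"
}

# fits_in_memory DATA_BYTES LIMIT_BYTES
# Succeeds only if the data is smaller than half the limit (assumed rule of
# thumb for this sketch, leaving headroom for the JVM and sort overhead).
fits_in_memory() {
  [ "$1" -lt $(( $2 / 2 )) ]
}

DATA_BYTES=8902588356                        # "Size of data to be sorted"
LIMIT_BYTES=$(( 16 * 1024 * 1024 * 1024 ))   # 16G limit from the question

if fits_in_memory "$DATA_BYTES" "$LIMIT_BYTES"; then
  echo "in-memory sort looks plausible"
else
  echo "prefer a disk-based sort"
fi
echo "cgroup limit seen here: $(cgroup_limit_bytes)"
```

With these numbers the data is slightly larger than half of 16G, so the sketch takes the disk-based branch. A `Killed` message for `sort` is typically what you see when the process receives SIGKILL, which is consistent with, though not proof of, the container's memory limit being hit.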
-
That's approaching 14G. I think
-
Thanks guys, I did the first suggestion of setting it to 1G and it has worked, so maybe I just have to force disk-based sorting. I'll try a couple more times just in case, and I will also try the options @afs added. Definitely heading in the right direction though, thanks for all the help.
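For anyone wanting to see the effect in isolation, here is a minimal sketch of a disk-backed sort using GNU `sort` directly, outside the loader. `--buffer-size` and `--temporary-directory` are standard GNU coreutils options. The sample data, the small buffer value, and the scratch paths are made up for illustration, and it assumes the 1G setting mentioned above refers to the sort buffer size.

```shell
#!/bin/sh
# Minimal sketch: cap sort's in-memory buffer so larger inputs spill to
# temporary files on disk instead of growing in RAM. Requires GNU sort.
set -eu

WORK_DIR=$(mktemp -d)              # scratch area for this demo only
TMP_SORT_DIR="$WORK_DIR/sort-tmp"
mkdir -p "$TMP_SORT_DIR"

# Made-up sample data containing a duplicate line.
printf '%s\n' c a b a > "$WORK_DIR/data.txt"

# -u drops duplicate lines, as in the sort call shown in the error message.
# --buffer-size caps memory use; --temporary-directory says where spill
# files go. 1M is an illustrative value; the thread used 1G.
sort --buffer-size=1M --temporary-directory="$TMP_SORT_DIR" -u \
  < "$WORK_DIR/data.txt" > "$WORK_DIR/sorted.txt"

# Join the sorted lines onto one line for display.
result=$(tr '\n' ' ' < "$WORK_DIR/sorted.txt")
echo "$result"

rm -rf "$WORK_DIR"
```

The temporary directory should sit on a volume with enough free space for the spill files, which for a sort of this kind can be roughly as large as the input.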
`--sort-args=--temporary-directory=/var/lib/tmp` should be a valid option to the script AFAICT.

If you are running with `--debug` then you should see some messages like the following:

Obviously those variables would be substituted with the real values in your environment.

You mention Kubernetes so the one thing that comes to mind is how are you invoking your script? Are you calling it from an entrypoint script in your container image, manually from a shell, via the `command` and/or `args` portions of your pod spec?

Whichever of those it is I would make sure you are properl…