Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1545648: PUT command error if file path contains spaces #996

Open
alex-sequentum opened this issue Jul 19, 2024 · 13 comments
Open

SNOW-1545648: PUT command error if file path contains spaces #996

alex-sequentum opened this issue Jul 19, 2024 · 13 comments
Assignees
Labels
bug status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. status-triage_done Initial triage done, will be further handled by the driver team

Comments

@alex-sequentum
Copy link

I am using Snowflake.Data NuGet package v4.0.0, / .NET8.0 / Windows 11

My file path contains a space and I am following the following suggestion
"If the directory path or filename includes special characters or spaces, enclose the entire file URI in single quotes. " I have found here

Below is sample code

using (var conn = new SnowflakeDbConnection())  
                {
                    conn.ConnectionString = "account=XXXXX;user=XXXX;password=\"XXXXX*\";db=\"MY_TEST\";schema=PUBLIC;ROLE=ACCOUNTADMIN";
                    conn.Open();
                    var command = conn.CreateCommand();
                    command.CommandText = "PUT 'file://C:/Users/Alexander/AppData/Local/Temp/mypiei jvzgq/*.*' @MY4_TEST_STAGE auto_compress=true";
                    command.ExecuteNonQuery();
                }

If a file path contains a space, it will throw the following error.
{"Error: SQL compilation error: parse error line 1 at position 59 near ''. SqlState: 42000, VendorCode: 1003, QueryId: 01b5be10-0001-2942-0001-f7f20001601e"}

If a file path does not contain a space and I do not use single quotes, then everything works fine, for example
command.CommandText = "PUT file://C:/Users/Alexander/AppData/Local/Temp/mypiei_jvzgq/*.* @MY4_TEST_STAGE auto_compress=true";

@github-actions github-actions bot changed the title PUT command error if file path contains spaces SNOW-1545648: PUT command error if file path contains spaces Jul 19, 2024
@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage Issue is under initial triage and removed bug labels Jul 22, 2024
@sfc-gh-dszmolka sfc-gh-dszmolka self-assigned this Jul 22, 2024
@sfc-gh-dszmolka
Copy link
Contributor

sfc-gh-dszmolka commented Jul 22, 2024

hey - i'm trying to reproduce this issue with the following setup for files; where both the directory name and the file name contains spaces:

c:\temp\directory name with spaces>dir
 Volume in drive C has no label.
 Volume Serial Number is 2291-20C4

 Directory of c:\temp\directory name with spaces

01/13/2022  07:57 PM    <DIR>          .
01/13/2022  07:57 PM    <DIR>          ..
09/21/2021  12:00 PM           487,731 001 dummy textfile with spaces.txt
09/21/2021  12:00 PM           487,731 abd dummy textfile with spaces.txt

repro script, with Snowflake .NET driver 4.0.0:

using System;
using System.Data;
using System.Data.Common;
using Snowflake.Data.Client;
namespace SnowflakeTestProgram
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                using (IDbConnection conn = new SnowflakeDbConnection())
                {
                    conn.ConnectionString = "account=...";
                    conn.Open();
                    Console.WriteLine("Connection successful!");

                    Console.WriteLine("--> Issuing PUT");
                    SnowflakeDbCommand command3 = new SnowflakeDbCommand((SnowflakeDbConnection)conn, 
                        "PUT 'file://C:/temp/directory name with spaces/*.*' @GH996 OVERWRITE=TRUE");
                    command3.ExecuteNonQuery();
                    /*
                     * or even with command.ExecuteReader()
                    System.Data.Common.DbDataReader reader3 = command3.ExecuteReader();
                    
                    while (reader3.Read())
                    {
                        Console.WriteLine(reader3.GetString(6));
                    }
                    */
                    string queryIdPut = command3.GetQueryId();
                    Console.WriteLine("queryIdPut: " + queryIdPut);

                }
            }
            catch (DbException exc)
            {
                Console.WriteLine("Error Message: {0}", exc.Message);
            }
        }
    }
}

result:
in both attempts , files are uploaded to the stage as expected.

ls @test_db.dotnet.gh996;
name | size | md5 | last_modified
-- | -- | -- | --
gh996/001 dummy textfile with spaces.txt.gz | 37520 | 70e620647dae961531609a0add39a712 | Mon, 22 Jul 2024 06:10:59 GMT
gh996/abd dummy textfile with spaces.txt.gz | 37520 | 4a61e8a969c909e40078070a8864a988 | Mon, 22 Jul 2024 06:10:59 GMT

so far, working as expected. I also took a look at the queryId you sent in the issue submission (01b5be10-0001-2942-0001-f7f20001601e), and it looks to have the query like this on the Snowflake side:

PUT 'file://test1.txt.gz @SE4_TEST_STAGE auto_compress=true

closing apostrophe missing. It is not even with a directory name with spaces, so maybe this queryId does not belong to the issue you're describing?

anyhow; since the use-case seems to be working to me as expected and documented, could you please provide further details on

  • file names (casing, etc) in the specific directory with spaces, with which you can reproduce the issue - perhaps there is some pattern which triggers the issue and I wasn't hitting yet?
  • full runnable program (not just code snippet) which when run, leads to the issue you're observing

This could help a lot in debugging the issue which is not reproducible for me. Thank you in advance !

@sfc-gh-dszmolka sfc-gh-dszmolka added question Issue is a usage/other question rather than a bug status-information_needed Additional information is required from the reporter labels Jul 22, 2024
@alex-sequentum
Copy link
Author

Thanks for looking into this.
I have just tried your code, I did copy and paste into my Visual Studio. I also use the same directory structure. Now, I am getting the following error
image
I have done a bit more digging and checked Snowflake Query History, I can see that the above source code tries to execute my PUT query twice for some reason. I also noticed that it strips the second quotation mark and I believe that is the reason why it fails.
image
It is not completely clear to me why it removes the second quotation mark during the execution. Is there a special way of escaping a single quotation mark in this case?
If we use the same code, I am wondering if there could be something unique about how my Snowflake database has been setup / configured which leads to this error?

@sfc-gh-dszmolka
Copy link
Contributor

This is indeed weird, and I'm wondering if the IDE itself has anything to do with the issue.
The driver should not iterate through the whole directory and issues a separate PUT for each file it finds there; it only issues one single PUT and then all files matching the filter will be attempted to PUT.

Speaking about which, as a mitigation and to hopefully become unblocked with your use-case, can you perhaps try issuing

"PUT 'file://C:/temp/directory name with spaces/*' @GH996 OVERWRITE=TRUE"

instead of the *.* matcher ? Should behave the same, * alone being able to put everything in the given directory.

To troubleshoot the issue further, can you please

  1. try executing the same program from
  1. enable TRACE-level logging for the driver and gather logs from a reproduction; hopefully could provide further pointers for executing PUT twice (once even incorrectly)

@alex-sequentum
Copy link
Author

I have managed to find a workaround and get this working by simply removing a space after the second quotation mark. If there are no spaces between ' and @, then it works.
The following works for me

SnowflakeDbCommand command3 = new SnowflakeDbCommand((SnowflakeDbConnection)conn,
     "PUT 'file://C:/temp/directory name with spaces/*.*'@GH996 OVERWRITE=TRUE");

image

image

@sfc-gh-dszmolka
Copy link
Contributor

good to hear you found a workaround ! I'm going to leave this open for a while to see if you have a chance to try the other things suggested too; or if other people have the same issue which was not yet reproducible for me.

@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage_done Initial triage done, will be further handled by the driver team and removed status-triage Issue is under initial triage status-information_needed Additional information is required from the reporter labels Jul 24, 2024
@alex-sequentum
Copy link
Author

Unfortunately, the workaround I mentioned won't work if both file path and internalStage have spaces. For example

SnowflakeDbCommand command3 = new SnowflakeDbCommand((SnowflakeDbConnection)conn, "PUT 'file://C:/temp/directory name with spaces/*.*' '@GH996 with space' OVERWRITE=TRUE");

It would great to find a real solution to this issue.

I attached extract from my log
log1.log

I use Visual Studio on WIndows for development, and I have not tried a different environment.

@sfc-gh-dszmolka
Copy link
Contributor

thank you for sending the log, for some reason indeed two queries are sent instead of one.

the

"PUT 'file://C:/temp/directory name with spaces/*.*' '@\"GH996 with space\"' OVERWRITE=TRUE");

format worked for me (maybe not very intuitive how to use stages with spaces, but at least it's documented)

can you please try it? also maybe using * instead of *.* as recommended above.

Sadly since in my Visual Studio 2022 (17.9.6, Windows 10, 64-bit) it's not reproducible and it doesn't seem to be possible for you to try another IDE or even the NET CLI, can you perhaps try from the same version of Visual Studio or let me know which one you use so I could try reproducing on the same version?

@alex-sequentum
Copy link
Author

It appears the issue is with the getFilePathFromPutCommand function in SFFileTransferAgent.cs .

        private string getFilePathFromPutCommand(string query)
        {
            // Extract file path from PUT command:
            // E.g. "PUT file://C:<path-to-file> @DB.SCHEMA.%TABLE;"
            int startIndex = query.IndexOf("file://") + "file://".Length;
            int endIndex = query.Substring(startIndex).IndexOf('@') - 1;
            string filePath = query.Substring(startIndex, endIndex);
            return filePath;
        }

It does not handle correctly cases if file path or/and internalStage have single quotation marks around them, for example

"PUT 'file://C:/temp/directory name with spaces/my file.txt' '@\"GH996 with space\"' OVERWRITE=TRUE"
"PUT 'file://C:/temp/directory name with spaces/my file.txt' @GH996 OVERWRITE=TRUE"

@sfc-gh-dszmolka
Copy link
Contributor

sfc-gh-dszmolka commented Jul 25, 2024

this is a good catch indeed! wondering why it behaves entirely differently for me with same Snowflake .NET driver 4.0.0.0 and .NET 8.0.
edit: also this code is here from 2 years ago, from when we started supporting PUT in the driver, not a new addition. I'm wondering no one else hit this problem in 2 years ?

Anyways, that's something we can look at. For posterity, prepared repro program without Snowflake to test output for the function:

using System;
					
public class Program
{
	public static void Main()
	{
		string getFilePathFromPutCommand(string query)
        {
            // Extract file path from PUT command:
            // E.g. "PUT file://C:<path-to-file> @DB.SCHEMA.%TABLE;"
            int startIndex = query.IndexOf("file://") + "file://".Length;
            int endIndex = query.Substring(startIndex).IndexOf('@') - 1;
            string filePath = query.Substring(startIndex, endIndex);
            return filePath;
        }
		string query1 = "PUT 'file://C:/temp/directory name with spaces/my file.txt' '@\"GH996 with space\"' OVERWRITE=TRUE";
		string query2 = "PUT 'file://C:/temp/directory name with spaces/my file.txt' @GH996 OVERWRITE=TRUE";
		string query3 = "PUT 'file://C:/temp/directory name with spaces/*.*' @GH996 OVERWRITE=TRUE";
		string query4 = "PUT 'file://C:/temp/directory name with spaces/*.*' '@\"GH996 with space\"' OVERWRITE=TRUE";
		Console.WriteLine(getFilePathFromPutCommand(query1));
		Console.WriteLine(getFilePathFromPutCommand(query2));
		Console.WriteLine(getFilePathFromPutCommand(query3));
		Console.WriteLine(getFilePathFromPutCommand(query4));
	}
}

Outputs

C:/temp/directory name with spaces/my file.txt' 
C:/temp/directory name with spaces/my file.txt'
C:/temp/directory name with spaces/*.*'
C:/temp/directory name with spaces/*.*' 

with a little extra at the end of each string.

In the meantime, I'm wondering (even if trying another IDE, NET CLI, etc. is apparently not even an option here), using * instead *.* in your program makes any difference ?

@sfc-gh-dszmolka sfc-gh-dszmolka added bug and removed question Issue is a usage/other question rather than a bug labels Jul 25, 2024
@alex-sequentum
Copy link
Author

Using * instead *.* does not makes any difference for me.
Specifying a single file also does not work for me, for example
"PUT 'file://C:/temp/directory name with spaces/my file.txt' @GH996 OVERWRITE=TRUE";

Unfortunately, I do not have time and capabilities to test this in another environment.

Just simply by looking at the getFilePathFromPutCommand function, it is difficult to imagine how it can possibly work correctly if there are single quotation marks.

When do you think this bug will be fixed? When the new release be available?

@sfc-gh-dszmolka
Copy link
Contributor

Currently i unfortunately cannot even estimate when the bug will be fixed, but I will keep this thread posted with every new information once they are available.

As a workaround, please use directory/file/stage names without spaces until it is fixed.

In the meantime, if this bug causes significant impact to your business, kindly reach out to your Snowflake account representative and let them know about it. They can help prioritize.

@sfc-gh-dszmolka sfc-gh-dszmolka added the status-pr_pending_merge A PR is made and is under review label Nov 29, 2024
@sfc-gh-dszmolka
Copy link
Contributor

addressed with #1066, in review

@sfc-gh-dszmolka sfc-gh-dszmolka added status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. and removed status-pr_pending_merge A PR is made and is under review labels Jan 8, 2025
@sfc-gh-dszmolka
Copy link
Contributor

fix is merged and will be part of the next release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. status-triage_done Initial triage done, will be further handled by the driver team
Projects
None yet
Development

No branches or pull requests

3 participants