Work with BOM unicode files #7

Open
GoogleCodeExporter opened this issue Jan 7, 2016 · 0 comments

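The plugin can't handle Unicode files that start with a byte order mark (BOM).
There is an easy fix: commons-io, which the project already uses, can read
this type of file through BOMInputStream. That class is not available in
commons-io 1.3.1, so the dependency has to be upgraded, and commons-io 2.4 in
turn requires Java 1.6, hence the compiler source/target change.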

Changes in pom.xml:
@@ -70,11 +70,11 @@
                                <groupId>org.apache.maven.plugins</groupId>
                                <artifactId>maven-compiler-plugin</artifactId>
                                <configuration>
-                                       <source>1.4</source>
-                                       <target>1.4</target>
+                                       <source>1.6</source>
+                                       <target>1.6</target>
                                </configuration>
                        </plugin>
@@ -99,7 +99,7 @@
                <dependency>
                        <groupId>commons-io</groupId>
                        <artifactId>commons-io</artifactId>
-                       <version>1.3.1</version>
+                       <version>2.4</version>
                </dependency>

Changes in AbstractDBMojo.java:
@@ -19,6 +19,8 @@
 import java.util.List;
 import java.util.zip.GZIPInputStream;

+import org.apache.commons.io.ByteOrderMark;
+import org.apache.commons.io.input.BOMInputStream;
 import org.apache.commons.lang.StringUtils;
 import org.apache.maven.plugin.AbstractMojo;
 import org.apache.maven.plugin.MojoExecutionException;
@@ -233,14 +240,13 @@

         // check encoding
         checkEncoding();
-
+
         // our file reader
-        Reader reader;
-        reader = new InputStreamReader(ips, scriptEncoding);
-
+        Reader reader = inputStreamToReaderBOM(ips);
+
         // create SQL Statement
         Statement st = con.createStatement();
-
+
         StringBuffer sql = new StringBuffer();
         String line;
         BufferedReader in = new BufferedReader(reader);
@@ -323,17 +328,16 @@
             ips = new GZIPInputStream(ips);
             getLog().info(" file is gz compressed, using gzip stream");
         }
-
+
         // check encoding
         checkEncoding();
-
+
         // our file reader
-        Reader reader;
-        reader = new InputStreamReader(ips, scriptEncoding);
-
+        Reader reader = inputStreamToReaderBOM(ips);
+
         // create SQL Statement
         Statement st = con.createStatement();
-
+
         StringBuffer sql = new StringBuffer();
         String line;
         BufferedReader in = new BufferedReader(reader);


+
+    private Reader inputStreamToReaderBOM(InputStream in) throws IOException {
+        BOMInputStream bOMInputStream = new BOMInputStream(in,
+            ByteOrderMark.UTF_16LE, ByteOrderMark.UTF_16BE,
+            ByteOrderMark.UTF_32LE, ByteOrderMark.UTF_32BE);
+        ByteOrderMark bom = bOMInputStream.getBOM();
+
+        String charsetName = bom == null ? scriptEncoding : bom.getCharsetName();
+
+        return new InputStreamReader(new BufferedInputStream(bOMInputStream), charsetName);
+    }
+
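
For readers who want to see what the BOM handling amounts to, here is a minimal, stdlib-only sketch of the same idea: peek at the first bytes, pick the charset the BOM implies, skip the mark, and otherwise fall back to the configured encoding. It is an illustration only, not the commons-io implementation and not part of the patch. `BomSniffer`, `openReader` and `defaultCharset` are hypothetical names (`defaultCharset` stands in for `scriptEncoding`). Unlike the BOM list in the patch above, the sketch also checks for the UTF-8 mark.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;

// Hypothetical helper for illustration; not part of the plugin or of commons-io.
final class BomSniffer {

    private BomSniffer() {}

    // Returns a Reader positioned after any BOM. When no BOM is found,
    // the stream is decoded with defaultCharset (assumed to play the role
    // of the plugin's scriptEncoding).
    static Reader openReader(InputStream in, String defaultCharset) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 4);
        byte[] head = new byte[4];
        int n = 0;
        // Read up to 4 bytes; a single read() may return fewer than requested.
        while (n < head.length) {
            int r = pin.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        String charset = defaultCharset;
        int bomLength = 0;
        // Check the 4-byte marks first: the UTF-32LE BOM (FF FE 00 00)
        // begins with the UTF-16LE BOM (FF FE).
        if (n >= 4 && matches(head, 0xFF, 0xFE, 0x00, 0x00)) {
            charset = "UTF-32LE";
            bomLength = 4;
        } else if (n >= 4 && matches(head, 0x00, 0x00, 0xFE, 0xFF)) {
            charset = "UTF-32BE";
            bomLength = 4;
        } else if (n >= 3 && matches(head, 0xEF, 0xBB, 0xBF)) {
            charset = "UTF-8";
            bomLength = 3;
        } else if (n >= 2 && matches(head, 0xFF, 0xFE)) {
            charset = "UTF-16LE";
            bomLength = 2;
        } else if (n >= 2 && matches(head, 0xFE, 0xFF)) {
            charset = "UTF-16BE";
            bomLength = 2;
        }

        // Push back every byte that was read but is not part of the BOM.
        if (n > bomLength) {
            pin.unread(head, bomLength, n - bomLength);
        }
        return new InputStreamReader(pin, charset);
    }

    // True when the leading bytes of head equal the expected unsigned values.
    private static boolean matches(byte[] head, int... expected) {
        for (int i = 0; i < expected.length; i++) {
            if ((head[i] & 0xFF) != expected[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A call such as `BomSniffer.openReader(ips, scriptEncoding)` would take the place of `new InputStreamReader(ips, scriptEncoding)`. One limitation of this kind of sniffing: a UTF-16LE file whose first character after the BOM is U+0000 is indistinguishable from UTF-32LE.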


Original issue reported on code.google.com by [email protected] on 2 Feb 2015 at 7:00
