[BUG] Import of Windows-1252 encoded file loses prolog and becomes mangled UTF-8 #5430
Comments
@ahenket I think this issue is addressed and fixed in the develop branch. Would you be able to test it with the latest develop and confirm?
Tried, but it would appear not:
% mvn -e -X -DskipTests package
....
[INFO] eXist-db Distributions ............................. FAILURE [ 2.369 s]
...
[ERROR] Failed to execute goal com.github.monkeywie:copy-rename-maven-plugin:1.0:rename (rename-jetty-etc-dir-for-appassembler) on project exist-distribution: could not rename /Users/ahenket/Development/GitHub/eXist/exist/exist-distribution/target/exist-distribution-7.0.0-SNAPSHOT-dir/etc/org/exist/jetty/etc to /Users/ahenket/Development/GitHub/eXist/exist/exist-distribution/target/exist-distribution-7.0.0-SNAPSHOT-dir/etc/jetty: Failed to delete /Users/ahenket/Development/GitHub/eXist/exist/exist-distribution/target/exist-distribution-7.0.0-SNAPSHOT-dir/etc/jetty while trying to rename /Users/ahenket/Development/GitHub/eXist/exist/exist-distribution/target/exist-distribution-7.0.0-SNAPSHOT-dir/etc/org/exist/jetty/etc -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.github.monkeywie:copy-rename-maven-plugin:1.0:rename (rename-jetty-etc-dir-for-appassembler) on project exist-distribution: could not rename /Users/ahenket/Development/GitHub/eXist/exist/exist-distribution/target/exist-distribution-7.0.0-SNAPSHOT-dir/etc/org/exist/jetty/etc to /Users/ahenket/Development/GitHub/eXist/exist/exist-distribution/target/exist-distribution-7.0.0-SNAPSHOT-dir/etc/jetty
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
...
Caused by: java.io.IOException: Failed to delete /Users/ahenket/Development/GitHub/eXist/exist/exist-distribution/target/exist-distribution-7.0.0-SNAPSHOT-dir/etc/jetty while trying to rename /Users/ahenket/Development/GitHub/eXist/exist/exist-distribution/target/exist-distribution-7.0.0-SNAPSHOT-dir/etc/org/exist/jetty/etc
at org.codehaus.plexus.util.FileUtils.rename (FileUtils.java:2092)
...
[ERROR]
...
@ahenket
Maybe spinning up a Docker container is easier:
@dizzzz you are right, that is way easier for one-off tests.
Describe the bug
When I import the attached file through oXygen's XML-RPC connection (eXide doesn't let me: different issue), eXist-db loses the prolog that declares the file as Windows-1252, but does not convert the content to UTF-8. So when you reopen the file, it is read with the XML default encoding UTF-8, and all characters outside of ASCII are broken.
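The mangling described above can be reproduced outside the database. The following is a minimal Python sketch, not eXist-db code: it only assumes that the stored bytes are still Windows-1252 while the reader decodes them as UTF-8 because the prolog is gone.

```python
# "patiënt" as stored in a Windows-1252 file: ë is the single byte 0xEB.
original = "patiënt"
raw = original.encode("cp1252")

# Without the prolog, an XML reader falls back to UTF-8. In UTF-8, 0xEB is the
# lead byte of a three-byte sequence, but the next byte ("n") is not a
# continuation byte, so the decoder substitutes U+FFFD (the replacement character).
mangled = raw.decode("utf-8", errors="replace")

print(raw)      # the Windows-1252 bytes
print(mangled)  # the text as it appears after reopening
```

The original character cannot be recovered from the decoded text afterwards, because every invalid byte maps to the same replacement character.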
Expected behavior
Either keep the encoding of uploaded files (including the prolog that declares it), or do an on-the-fly conversion to UTF-8 before committing to the database.
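The second option, on-the-fly conversion, could look roughly like the Python sketch below. It is an illustration under stated assumptions, not how eXist-db handles uploads: the function name `to_utf8` and the regular expression are hypothetical, and the sketch only covers ASCII-compatible encodings that are declared in the XML declaration.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very start.
_DECL = re.compile(
    r'^(<\?xml[^>]*?\bencoding\s*=\s*)(["\'])([A-Za-z][A-Za-z0-9._-]*)\2'
)


def to_utf8(raw: bytes) -> bytes:
    """Re-encode an XML document to UTF-8 and update its declaration (hypothetical helper)."""
    # Peek at the declaration as Latin-1, which never fails to decode.
    head = raw[:200].decode("latin-1")
    match = _DECL.match(head)
    if match is None:
        # No declared encoding: leave the bytes untouched.
        return raw
    declared = match.group(3)
    # Decode with the declared encoding, e.g. "Windows-1252".
    text = raw.decode(declared)
    # Rewrite the declaration so it matches the new byte encoding.
    text = _DECL.sub(lambda m: f"{m.group(1)}{m.group(2)}UTF-8{m.group(2)}", text, count=1)
    return text.encode("utf-8")


sample = '<?xml version="1.0" encoding="Windows-1252"?><a v="patiënt"/>'.encode("cp1252")
print(to_utf8(sample).decode("utf-8"))
```

The point of the sketch is that the bytes and the declaration are changed together, so the stored document never claims one encoding while containing another.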
To Reproduce
Extract the one file from the zip and upload that file anywhere on your server. Now reopen using oXygen or eXide and look for "pati". The first hit reads "pati�nt" instead of "patiënt" and is in this path: /XMI/XMI.content[1]/UML:Model[1]/UML:Namespace.ownedElement[1]/UML:Package[1]/UML:Namespace.ownedElement[1]/UML:Collaboration[1]/UML:Namespace.ownedElement[1]/UML:ClassifierRole[2]/UML:ModelElement.taggedValue[1]/UML:TaggedValue[1]/@value
nl.zorg.Zwangerschap-v4.1.xmi.zip
There are 27 occurrences of � that were regular Windows-1252-compatible characters before the import.
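To check a stored document after a fix, the replacement characters can be counted programmatically. A small Python sketch follows; the helper name `count_mangled` and the sample string are made up for illustration, and the real input would be the document text retrieved from the database.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, shown as � in most editors


def count_mangled(text: str) -> int:
    """Return how many U+FFFD replacement characters appear in text."""
    return text.count(REPLACEMENT)


# Illustrative sample, not taken from the attached file.
stored = "pati\ufffdnt en cli\ufffdnt"
print(count_mangled(stored))
```

A correctly imported copy of the attached file should yield a count of zero.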
Environment