Latin Characters with accents

(Soroosh Nazem) #1

Hello everyone,

I am trying to store some Italian text in my Neo4j server, and I need to know how to handle Latin characters with accents. At the moment, accented characters such as ò, à, and ù are replaced with garbage characters. Can somebody tell me how to handle this problem?

Thank you



(M. David Allen) #2

Neo4j supports storing Unicode text, so if the data is input properly, it should come out properly.

The crazy characters you're seeing may be the result of another software layer. Can you describe exactly how you're getting the text into Neo4j and, when it comes out, how it is being displayed?


(Soroosh Nazem) #3

Actually, the data is already stored on a MySQL server. From there I download the data to my PC, and then in Python, using the py2neo package, I try to write all of it to a remote Neo4j database.
Let me show you part of the text:
"...... è un prodotto che ..... di collegare.........o ricaricare .......... All'interno del concetto di smart...."
For privacy reasons I had to cut a lot of the text. As you can see, there are two items of interest here: "è" and "All'interno".
"è" is substituted by "è". For "All'interno", I had to open the JSON file and replace every " ' " with " ' ", because when I download the data, "All'interno" is written like "All'interno".
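
The manual replacement step could be scripted. The exact form the apostrophe took in the download is not shown above, so the sketch below assumes it arrived as an HTML character reference such as `&#39;`. That is only an assumption, and the sample string is hypothetical.

```python
import html

# Hypothetical sample: assumes the apostrophe was exported as the
# HTML character reference "&#39;" (the real form is not shown in the post).
downloaded = "All&#39;interno del concetto di smart"

# html.unescape converts HTML character references back to characters.
cleaned = html.unescape(downloaded)

print(cleaned)  # All'interno del concetto di smart
```

If the downloaded text uses some other escape form, this call will leave it unchanged, so it is worth checking the raw file first.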


(M. David Allen) #4

The py2neo code here matters a lot. As I was saying, Neo4j knows how to store and return the right data, but what you're describing sounds like the Unicode handling in your Python code may not be correct, and the problem you describe can certainly arise from that.
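
To illustrate the kind of mistake I mean, here is a minimal Python 3 sketch of how an accented character gets garbled when UTF-8 bytes are decoded with the wrong codec. It is not taken from your code; the sample string and the choice of Latin-1 as the wrong codec are assumptions for illustration.

```python
# Sample text (assumed for illustration).
original = "è un prodotto"

# Encode correctly as UTF-8, then decode with the wrong codec (Latin-1).
# The two UTF-8 bytes of "è" become two separate characters.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)  # Ã¨ un prodotto

# If nothing else was lost, reversing the wrong step recovers the text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # è un prodotto
```

If the output you see looks like the garbled line, some layer between MySQL and Neo4j is probably reading UTF-8 data as a single-byte encoding.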

One thing I can suggest is that you post some of your MySQL / Python / Neo4j code in the Python topic and ask for help there with this particular issue. I'm not super deep in Python, but I know that there were major changes to Unicode support between Python 2 and 3, and this is definitely something that could arise from a fairly simple mistake there.


(Soroosh Nazem) #5

By the way, I am using Python 3. I set the encoding to utf-8 and it worked. Thank you.
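
For anyone who finds this later, a rough sketch of what setting the encoding can look like when the exported data is read from a JSON file. The file name, the `description` field, and the temporary directory are made up for this example, and the py2neo write itself is left out.

```python
import json
import os
import tempfile

# Hypothetical record standing in for a row exported from MySQL.
record = {"description": "è un prodotto"}

# Hypothetical export file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "export.json")

# Write UTF-8 explicitly; ensure_ascii=False keeps "è" as a real character.
with open(path, "w", encoding="utf-8") as f:
    json.dump(record, f, ensure_ascii=False)

# Reading with the matching encoding returns the original text.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

# Reading the same file as Latin-1 reproduces the garbled characters.
with open(path, encoding="latin-1") as f:
    garbled = json.load(f)

print(loaded["description"])   # è un prodotto
print(garbled["description"])  # Ã¨ un prodotto
```

Passing `encoding="utf-8"` explicitly avoids relying on the platform default, which can differ between machines.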
