Mapping List of Unicode Strings in SQLAlchemy
Problem Mapping JSON or a List of Strings in SQLAlchemy #
When storing JSON data or a "list of strings" value in SQLite using SQLAlchemy, mapping the column type is not straightforward.
from sqlalchemy import String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class Words(Base):
    __tablename__ = "words"
    id: Mapped[int] = mapped_column(primary_key=True)
    word: Mapped[list[str]] = mapped_column(String)
Mapping word to mapped_column(String) gives an error when inserting ["foo", "bar"] into word:
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Error binding parameter 2: type 'list' is not supported
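For reference, this is the kind of insert that triggers the error (a minimal sketch, assuming an in-memory SQLite engine and the Base/Words definitions above):

from sqlalchemy import create_engine
from sqlalchemy.orm import Session

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Words(word=["foo", "bar"]))
    session.commit()  # raises the error above at flush time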
Solution to Inserting a List of Strings #
In Postgres, you can use ARRAY(String) instead, but in SQLite, ARRAY does not exist.
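For comparison, the Postgres-only mapping might look like this (a minimal sketch using the postgresql dialect's ARRAY type; it will not work on SQLite):

from sqlalchemy.dialects.postgresql import ARRAY

class Words(Base):
    __tablename__ = "words"
    id: Mapped[int] = mapped_column(primary_key=True)
    # Native Postgres array column (VARCHAR[]); SQLite has no equivalent.
    word: Mapped[list[str]] = mapped_column(ARRAY(String))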
So, for SQLite, I found out that you can map JSON instead of String.
from sqlalchemy.types import JSON

class Words(Base):
    __tablename__ = "words"
    id: Mapped[int] = mapped_column(primary_key=True)
    word: Mapped[list[str]] = mapped_column(JSON)
Then, the list of strings can be inserted into word.
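For example, a round trip with the JSON column might look like this (a minimal sketch, again assuming the Base and Words definitions above):

from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Words(word=["foo", "bar"]))
    session.commit()

with Session(engine) as session:
    row = session.scalars(select(Words)).first()
    print(row.word)  # ['foo', 'bar'] -- comes back as a Python list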
The Next Problem (Escaped Unicode Sequences) #
Because I often work with Korean text, when I insert a list of Korean strings, SQLAlchemy escapes the Korean characters when it serializes the value to JSON with the default json.dumps().
So when storing ["한글", "영어"], it is saved as escaped sequences like "\uxxxx\uxxxx", which does not help when reading the value with sqlite3 or litecli.
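The difference is easy to reproduce with plain json.dumps():

import json

print(json.dumps(["한글", "영어"]))
# ["\ud55c\uae00", "\uc601\uc5b4"]
print(json.dumps(["한글", "영어"], ensure_ascii=False))
# ["한글", "영어"]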
To store the Unicode as-is, you need to create a custom type so that the value is serialized with json.dumps(value, ensure_ascii=False).
Solution to Escaped Unicode Sequences #
from sqlalchemy.types import Text, TypeDecorator
import json

class UnicodeJSON(TypeDecorator):
    """Store a value as JSON text without escaping non-ASCII characters."""

    impl = Text
    cache_ok = True  # safe to use with SQLAlchemy's statement cache

    def process_bind_param(self, value, dialect):
        # Serialize on the way in, keeping Unicode characters as-is.
        if value is None:
            return None
        return json.dumps(value, ensure_ascii=False)

    def process_result_value(self, value, dialect):
        # Deserialize the JSON text back into a Python object on the way out.
        if value is None:
            return None
        return json.loads(value)

class Words(Base):
    __tablename__ = "words"
    id: Mapped[int] = mapped_column(primary_key=True)
    word: Mapped[list[str]] = mapped_column(UnicodeJSON)
Now, when Korean text is inserted into word, the characters are stored as-is.
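To double-check, a raw query outside the ORM shows the stored JSON text unescaped (a minimal sketch, again assuming an in-memory engine and the definitions above):

from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Words(word=["한글", "영어"]))
    session.commit()

with engine.connect() as conn:
    print(conn.execute(text("SELECT word FROM words")).scalar_one())
    # ["한글", "영어"] -- readable text, no \uxxxx escapes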