Sunday, August 25, 2013

Unicode sort order for CharField data in several scripts

Gentlefolk,

We have a research database (GeoDjango 1.5.1 on Postgres 9.2/PostGIS 2.0) that includes one model for words in any human language. These words are entered in locally legible scripts (thus Sanskrit or Newari terms are in Devanagari, Persian in Perso-Arabic, Mandarin in Traditional Chinese, and Latin or English in Roman script). The problem is that the only script that sorts correctly is Roman. This makes sense given the limits of LC_COLLATE in Postgres: collations are always locale-specific, such as en_GB.UTF-8, and the only 'global' collations are C and POSIX, which sort by byte value rather than by the Unicode Collation Algorithm. So far as I know there is no implementation of the Unicode Collation Algorithm within Postgres.
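To make the C/POSIX behaviour concrete, here is a small sketch in plain Python, whose default string comparison is also by code point (for UTF-8 data, byte order and code point order coincide). The word list and the helper name `codepoint_sorted` are made up for illustration and are not from our database.

```python
# Illustrative only: hypothetical sample words, not real data from the model.
words = [u"zebra", u"Apple", u"\u00c9mile", u"apple"]


def codepoint_sorted(items):
    """Sort strings by raw code point, which is how C/POSIX collation behaves."""
    return sorted(items)


result = codepoint_sorted(words)
# Upper case sorts before lower case, and the accented capital E (U+00C9)
# lands after "zebra", which no dictionary would do.
print(result)
```

A reader of a word list would expect the name beginning with the accented E to fall between the "apple" entries and "zebra"; the same kind of misplacement is what we see within and across the non-Roman scripts.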

There is a pyuca module available on PyPI, but I'm not a good enough coder to see how to wire it into the Django ORM to enable true Unicode sorts. Has anyone tackled this problem before?
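For what it is worth, the furthest I can see is sorting in Python after the query has run, rather than inside the ORM. The sketch below assumes pyuca's documented `Collator().sort_key` interface; the helper names `make_uca_key` and `sort_names` are my own inventions, and I have not tried this against the real database.

```python
def make_uca_key():
    """Return pyuca's sort-key function (assumes pyuca is installed)."""
    # Imported lazily so this module still loads where pyuca is absent.
    from pyuca import Collator
    return Collator().sort_key


def sort_names(rows, key_func):
    """Sort (language, nomen) pairs by language, then by key_func(nomen).

    key_func is any function mapping a string to a comparable sort key,
    for example the one returned by make_uca_key().
    """
    return sorted(rows, key=lambda row: (row[0], key_func(row[1])))


# Hypothetical use with the model below (untested sketch):
#   rows = Name.objects.values_list('language', 'nomen')
#   ordered = sort_names(rows, make_uca_key())
```

This pulls every row into memory and bypasses the Meta ordering, so it would not help with paginated admin lists or with ORDER BY in the database itself, which is really what I am after.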


The relevant bit of the model reads:

from django.db import models
import uuidfield  # third-party package providing UUIDField


class Name(models.Model):
    name_id = models.AutoField(primary_key=True)
    uuid = uuidfield.UUIDField(auto=True)
    nomen = models.CharField(max_length=200, help_text="Please enter the name in an accurate script.")
    language = models.CharField(max_length=40, default="Latin", help_text="Please use standard language names or codes as defined in the ISO 639 standard.")

    class Meta:
        # This ordering is applied by the database, so it follows LC_COLLATE.
        ordering = ('language', 'nomen')
        verbose_name = 'lexical item'
        verbose_name_plural = 'lexical items'

    def __unicode__(self):
        return self.nomen + u" (" + self.language + u")"


Many thanks,

-WBTD.
- - -- --- ----- -------- -------------
Will Tuladhar-Douglas
University of Aberdeen
