UTF-8, UCS-2 in J

If a language lets you call a function from a DLL, what do you try first? MessageBox, of course! Well, it turned out to be a bit harder than I had thought. If anything is wrong in the parameters or the declaration, the call just doesn't work and gives you no idea what's wrong. But once it's done, it looks very easy, of course :)

'User32 MessageBoxA >i x *c *c x' cd 0;'Hello world!';(,'J');0   NB. cd = 15!:0 *** You see? 'J' alone is a scalar, hence the ravel (,). Duh!
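To see what that comment is about, here is a tiny sketch (nothing in it comes from the DLL call itself) that just compares the shape and rank of the two values:

```j
$ 'J'       NB. empty shape: 'J' is a scalar
# $ 'J'     NB. 0, the rank of a scalar
$ ,'J'      NB. 1: ravel turns it into a one-element list
# $ ,'J'    NB. 1, the rank of a list
```

A string of two or more characters is already a list, which is why 'Hello world!' needs no ravel.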

OK, once that works, what's next? 'Hello world!' in some non-English language, of course! And that's never easy. The quirks here are: 1) MessageBoxW accepts wchar strings, but they must be passed as normal (byte) strings; 2) J has plenty of encoding conversions (well, actually one function, u:, with 8 conversion modes), but none of them converts string to string while transforming UTF-8 to UCS-2! It could be (6&u:)^:_1, but that doesn't exist. So I had to deal with raw bloody bytes.

title=: ,(0{a.),.~'J i18n'   NB. ASCII to UCS-2 is easy
u=: a.{~128+80 63 81 0 80 56 80 50 80 53 81 2   NB. привет in UTF-8
w=: 7 u: u   NB. wchar string, but we need plain string with same contents!
z=: (+:#w){.20}.0(3!:1)w   NB. same bytes as in wchar string: skip the first 20 bytes of the binary representation, take 2 bytes per char
'User32 MessageBoxW >i x * * x' cd 0;z;title;0   NB. finally :)
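The 20}. above relies on the exact byte layout that 3!:1 produces, and that layout may well differ between J builds (an assumption, not something checked here). A minimal sketch of another way to get the same little-endian UCS-2 bytes is to go through the integer code units instead; codes and z2 are names made up for this example:

```j
u=: a.{~128+80 63 81 0 80 56 80 50 80 53 81 2   NB. привет in UTF-8, as above
w=: 7 u: u                                      NB. wchar string, 6 characters
codes=: 3 u: w                                  NB. integer code units: 1087 1088 1080 1074 1077 1090
z2=: , a. {~ |."1 (256 256) #: codes            NB. split each into (high,low), swap to (low,high), ravel to bytes
a. i. z2                                        NB. 63 4 64 4 56 4 50 4 53 4 66 4
```

Where the 3!:1 version works, z2 -: z should come out as 1; z2 can then be passed to MessageBoxW in place of z.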

P.S.
W=: 3 : '2 u: 7 u: y'   NB. UTF-8 to wchar; the reverse (wchar to UTF-8) is 8&u:
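A quick round-trip check of that pair, as a sketch; it reuses the привет bytes from above and the name back is made up here:

```j
W=: 3 : '2 u: 7 u: y'                           NB. UTF-8 to wchar, as defined above
u=: a.{~128+80 63 81 0 80 56 80 50 80 53 81 2   NB. привет in UTF-8
back=: 8 u: W u                                 NB. wchar back to UTF-8
u -: back                                       NB. 1: the round trip returns the original bytes
# W u                                           NB. 6 characters, against 12 bytes in u
```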
