Archive for November, 2008

This comparison also reveals a difference between the positivist and interpretive, or hermeneutic approach to the interpretation of myths. Positivists read myths literally and find them false and foolish; interpretivists read them metaphorically or allegorically and find them true and profound.

A well-known scientist (some say it was Bertrand Russell) once gave a public lecture on astronomy. He described how the earth orbits around the sun and how the sun, in turn, orbits around the center of a vast collection of stars called our galaxy. At the end of the lecture, a little old lady at the back of the room got up and said: “What you have told us is rubbish. The world is really a flat plate supported on the back of a giant tortoise.” The scientist gave a superior smile before replying, “What is the tortoise standing on?” “You’re very clever, young man, very clever,” said the old lady. “But it’s turtles all the way down!”

Stephen Hawking, A Brief History of Time


I finally managed one critical simplification that enabled me to complete the database model. Today, I released a new version of my kanji quiz program that uses a database instead of flat files, resulting in a huge speed improvement and increased flexibility.

So, what was my simplification? Avoiding radicals altogether.

Well, okay, I didn’t go quite that far. What I did was ignore any kanji composition information that didn’t consist only of other kanji. So the kanji for “time” still shows, when you run the program, that it’s made up of “sun” and “temple”. But anything with the “grass radical” doesn’t have composition information any more.
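A minimal sketch of that simplification in Python, not the program's actual code. The names `KNOWN_KANJI`, `RAW_COMPOSITION`, and `kanji_only` are hypothetical, and the string `"GRASS_RADICAL"` is just a placeholder for a component that is not a full kanji.

```python
# Hypothetical set of characters the program treats as full kanji.
KNOWN_KANJI = {"時", "日", "寺", "土", "寸", "花", "化"}

# Hypothetical raw composition data, before the simplification is applied.
RAW_COMPOSITION = {
    "時": ["日", "寺"],            # time = sun + temple
    "寺": ["土", "寸"],            # temple = earth + inch
    "花": ["GRASS_RADICAL", "化"],  # flower includes a non-kanji component
}


def kanji_only(composition, known_kanji):
    """Keep only the entries whose components are all full kanji."""
    return {
        kanji: parts
        for kanji, parts in composition.items()
        if all(part in known_kanji for part in parts)
    }


filtered = kanji_only(RAW_COMPOSITION, KNOWN_KANJI)
```

With this sample data, the entries for “time” and “temple” survive, while “flower” loses its composition information entirely because one of its parts is not a kanji.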

An unfortunate simplification, to be sure, but one that I think was necessary for me to get a release finished in any reasonable time period. In the near future, I hope to revisit that decision and figure out some way of enabling non-kanji radical information to be present and displayed. First, though, I think I’ll write a tool that allows editing and revising of that information—already I’ve noticed some inaccuracies in the current kanji-only data that I have.

To make up for this reduction in information, however, I’ve added a new feature: kanji compound words. Now, when viewing kanji information, you can click a button and get a list of all “edict” words that contain that particular kanji. This really helps memorization—more so, I think, even than the radical information.

For example, the word for photograph, “sha-shi-n”, is made up of two characters that I didn’t know before. The characters themselves are not used very often in isolation, so it’s difficult to remember their basic meaning. But when I remember that they’re part of “photograph”, I easily recall the readings and a basic idea of what each means.
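The compound lookup could be sketched roughly like this in Python. It assumes the dictionary has already been loaded into simple tuples; `ENTRIES` and `words_containing` are hypothetical names, and the real “edict” file format is not parsed here.

```python
# Hypothetical sample of dictionary entries as (word, reading, gloss) tuples.
ENTRIES = [
    ("写真", "しゃしん", "photograph"),
    ("真実", "しんじつ", "truth"),
    ("時間", "じかん", "time"),
]


def words_containing(kanji, entries):
    """Return every entry whose written form contains the given kanji."""
    return [entry for entry in entries if kanji in entry[0]]


# Look up all sample words that use the second character of "photograph".
matches = words_containing("真", ENTRIES)
```

A linear scan like this is fine for a sketch; with a database behind the program, the same lookup would more likely be a query against an indexed table.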

So, what’s next on my plate? Enabling the user to select different kanji lists, rather than being stuck on the two hard-coded JLPT lists. Maybe after that I’ll get bored and move on to the next project …


In my last post, I talked about the composition of Japanese kanji characters, and how it’s been giving me difficulties as I try to create a good database model for my kanji quiz program.

Probably the biggest problem is that not every sub-character is a complete kanji character. The four I mentioned in that post (sun, temple, earth, and inch) are all kanji characters that can be found alone quite commonly. However, the grass radical (which can be found in kanji such as “tea”, “flower”, and of course “grass”) is not a kanji character itself. You’ll never see it on its own (there’s no Unicode character for it), and there’s not even a standard way of referring to it. One source might call it “ku-sa-n-mu-ri”, another calls it “Bushu 140, Variant 2”, and yet another refers to it simply as “Element #1783”. How should my database refer to it? Should I give it yet another arbitrary number, or should I use one of the names somebody else uses?
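One possible answer, sketched with Python's built-in `sqlite3` module, is to do both: give each element an arbitrary internal id and keep every external name in a separate alias table. The table names, column names, and source labels below are assumptions made for illustration, not the schema the program ended up with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE element (
        id       INTEGER PRIMARY KEY,   -- arbitrary internal number
        is_kanji INTEGER NOT NULL       -- 0 for radicals that are not kanji
    );
    CREATE TABLE element_alias (
        element_id INTEGER NOT NULL REFERENCES element(id),
        source     TEXT NOT NULL,       -- which reference uses this name
        name       TEXT NOT NULL,
        UNIQUE (source, name)
    );
""")

# Register the grass radical once, then attach every name it goes by.
grass_id = conn.execute("INSERT INTO element (is_kanji) VALUES (0)").lastrowid
conn.executemany(
    "INSERT INTO element_alias (element_id, source, name) VALUES (?, ?, ?)",
    [
        (grass_id, "source_a", "ku-sa-n-mu-ri"),
        (grass_id, "source_b", "Bushu 140, Variant 2"),
        (grass_id, "source_c", "Element #1783"),
    ],
)


def find_element(conn, name):
    """Resolve any known alias to the internal element id, or None."""
    row = conn.execute(
        "SELECT element_id FROM element_alias WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

The trade-off is an extra join on every lookup, in exchange for never having to pick one source's naming scheme as the canonical one.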

Another problem is that, of these many sources I’ve looked at, none are complete. In fact, most of them are not only woefully incomplete, but in some cases simply wrong. So I not only have to deal with incomplete data, but I also have to deal with incorrect data, and ensure that whatever format I use in my own database, it’s easy to change or update when I come across incorrect data that I imported from elsewhere.
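One way to keep imported data easy to correct is to never edit it in place, and to layer local corrections on top. This is only a sketch under that assumption; `IMPORTED`, `CORRECTIONS`, `composition_of`, and the source labels are all hypothetical.

```python
# Hypothetical imported data, kept separate per source and never edited.
IMPORTED = {
    "source_a": {"時": ["日", "寺"]},
    # Deliberately incomplete entry for 時, to stand in for bad source data.
    "source_b": {"時": ["日"], "寺": ["土", "寸"]},
}

# Local corrections live in their own layer and always win over imports.
CORRECTIONS = {}


def composition_of(kanji, source_priority):
    """Return the corrected composition if one exists, else the first
    imported composition found, trying sources in the given order."""
    if kanji in CORRECTIONS:
        return CORRECTIONS[kanji]
    for source in source_priority:
        parts = IMPORTED.get(source, {}).get(kanji)
        if parts is not None:
            return parts
    return None
```

Because the imports are untouched, a correction can be withdrawn later, or a source re-imported, without losing track of which data came from where.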

So, after puzzling through this problem all weekend, and attempting to drastically simplify all my assumptions and use cases so I could cut this down to something manageable—even if it had to be reworked significantly later—I find myself no closer to my final goal of having a working database for my kanji quiz program. I’ve caught myself going down numerous dead ends, realizing the flaws in my implementation or data model, then heading down another path that failed for different reasons. I keep trying to carve off pieces of the headache-inducing problem, trying to get down to a smaller and smaller piece until I’ve finally got something small enough to chew, but I still end up with something too large for me to reasonably tackle on my own.

I feel like I’m getting closer—every so often I can catch a glimpse of the light at the end of the tunnel—but then I crash into a wall I had forgotten about or a new wall I hadn’t yet encountered, and I wonder exactly how far away I really am.


My Tsumego quiz program got a lot of action last week! Many people tried it out, and a dozen or so people gave comments. One person even emailed me directly, asking for a particular feature. So, time to stop working on the Japanese quiz program and get back to work on the Tsumego quiz program, right?

Well, not quite. First of all, this week somebody also emailed me about the Japanese quiz program, suggesting some improvements to it. But more to the point, these programs are my hobbies that I work on in my spare time, so I work on what I feel like working on, not necessarily what people are clamoring for! And I wanted to work on the Japanese program.

Last week, I mentioned that I’d done the final bit of refactoring to abstract out the nasty flat-file system so it could be more easily changed to be backed by a database. So, this weekend, I really wanted to get the database schema sorted out, and a solid plan for how things should look in the new, faster, easier-to-maintain world. Unfortunately, this was not to be. Once again, I ran into the problem of how to model the kanji radicals.

As I’ve mentioned before, each kanji character in Japanese can be broken up into smaller pieces. For example, the kanji for “time” is made up of two smaller characters, one for “sun” and one for “temple”. The kanji for “temple” is also made up of two smaller characters, one for “earth” and one for “inch”. Native Japanese speakers don’t typically think of these sub-characters any more than we think of the etymologies of the words we use, but they can be very useful to non-native speakers as mnemonics for both meaning and stroke order.
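The nesting in that example can be written down as a small recursive structure. The Python sketch below uses only the four components named above; `COMPONENTS`, `decompose`, and `leaves` are hypothetical names, and it assumes the data contains no cycles.

```python
# The example from the text: time = sun + temple, temple = earth + inch.
COMPONENTS = {
    "時": ["日", "寺"],
    "寺": ["土", "寸"],
}


def decompose(kanji, components):
    """Return a nested (kanji, [children]) tree for one character."""
    children = [decompose(part, components) for part in components.get(kanji, [])]
    return (kanji, children)


def leaves(kanji, components):
    """Return the smallest pieces a character breaks down into, in order."""
    parts = components.get(kanji)
    if not parts:
        return [kanji]
    result = []
    for part in parts:
        result.extend(leaves(part, components))
    return result
```

Here `leaves("時", COMPONENTS)` walks through “temple” and returns sun, earth, and inch as the three smallest pieces.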

Unfortunately, there are significant complexities when it comes to modelling this data in a programmatic way. I’ll talk more about these nasties in my next post.


With Russian tanks only 30 miles from Tbilisi on August 12, Mr. Sarkozy told Mr. Putin that the world would not accept the overthrow of Georgia’s Government. According to Mr. Levitte, the Russian seemed unconcerned by international reaction. “I am going to hang Saakashvili by the balls,” Mr. Putin declared.

Mr. Sarkozy thought he had misheard. “Hang him?”—he asked. “Why not?” Mr. Putin replied. “The Americans hanged Saddam Hussein.”

Mr. Sarkozy, using the familiar tu, tried to reason with him: “Yes, but do you want to end up like [President] Bush?” Mr. Putin was briefly lost for words, then said: “Ah—you have scored a point there.”

http://www.timesonline.co.uk/tol/news/world/europe/article5147422.ece


This has to be the very nerdiest thing I’ve seen in a long, long time.

Remember the Erector set? A bunch of metal pieces with nuts and bolts to stick ’em together, and a little battery-powered motor to make the toy crane or train or whatever actually move?

Well, there’s a company called 80/20 that makes what they call an “Industrial Erector Set”: the same fundamental principle as the Erector set of yesterday—simple individual parts that can be combined in many different ways—but taken to the “industrial” level. The devices made from parts they provide can actually be used in production environments to do real things—they’re not just toys for kids to play with.

So, this company, using their “industrial” erector set parts, put together a device of sorts that could manipulate a Rock Band guitar. As if this wasn’t geeky enough, they then hooked up a camera and a computer to this device. The camera watches the screen, as a human would, detects the symbols that mean a button on the guitar should be pressed, and then sends synchronized signals to the guitar-bot that will play Rock Band (near) perfectly.
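Nothing here says how the machine actually times its presses, so the following Python sketch is pure guesswork about the general idea: if a note is spotted some fixed time before it must be played, the press is scheduled for that moment minus however long the mechanism takes to move. `Detection`, `schedule_presses`, `travel_time`, and `actuator_latency` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    button: str     # which guitar button the on-screen symbol maps to
    seen_at: float  # time in seconds when the camera spotted the symbol


def schedule_presses(detections, travel_time, actuator_latency):
    """Turn detections into (press_time, button) pairs, earliest first.

    travel_time: assumed seconds between spotting a symbol and its hit time.
    actuator_latency: assumed seconds the mechanism needs to press a button.
    """
    presses = [
        (d.seen_at + travel_time - actuator_latency, d.button)
        for d in detections
    ]
    return sorted(presses)


presses = schedule_presses(
    [Detection("red", 1.5), Detection("green", 1.0)],
    travel_time=2.0,
    actuator_latency=0.05,
)
```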

Truly, this is a triumph of amazingly geeky proportions.


Today, I finished up enough to make a first release of my tsumego program. Thankfully, the web site I mentioned last week is back up and running, so hopefully this week I’ll get enough time to package everything together, make a post about it, and start getting some feedback. I finished the problem serialization (which was most of the work), then added some problem de-duping, and fixed up the statistics placeholders. Everything now seems to be working just fine.

Fortunately, all the quiz work I’ve been doing will carry over quite nicely to the old Japanese quiz program project. Somebody actually emailed me this week saying that they really like the program, and sent me a patch for a minor bug it had. They also asked me if I could add more kanji, so I’ll probably start working on that next. I’ve been trying to get rid of the nasty, nasty flat-file format underneath everything, and today I managed to finish up the last bit of refactoring that gets rid of any direct dependencies on that flat-file format. Now I have a solid API that I can write my database code against, and once the database code behind that API is complete, the flat-file code can disappear forever.
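The shape of that refactoring might look something like the Python sketch below: the quiz code talks only to an abstract store, and the flat-file and database versions are interchangeable behind it. The class names, the tab-separated line format, and the table layout are all assumptions for illustration, not the program's real API.

```python
import sqlite3
from abc import ABC, abstractmethod


class KanjiStore(ABC):
    """The only interface the quiz code is allowed to depend on."""

    @abstractmethod
    def meaning(self, kanji):
        """Return the meaning of a kanji, or None if it is unknown."""


class FlatFileStore(KanjiStore):
    """Reads hypothetical 'kanji<TAB>meaning' lines."""

    def __init__(self, lines):
        self._meanings = dict(
            line.rstrip("\n").split("\t", 1) for line in lines if line.strip()
        )

    def meaning(self, kanji):
        return self._meanings.get(kanji)


class SqliteStore(KanjiStore):
    """Answers the same question from a hypothetical 'kanji' table."""

    def __init__(self, conn):
        self._conn = conn

    def meaning(self, kanji):
        row = self._conn.execute(
            "SELECT meaning FROM kanji WHERE literal = ?", (kanji,)
        ).fetchone()
        return row[0] if row else None


def quiz_prompt(store, kanji):
    """Quiz code that works with either backend."""
    return f"{kanji} means {store.meaning(kanji)}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kanji (literal TEXT PRIMARY KEY, meaning TEXT NOT NULL)")
conn.executemany("INSERT INTO kanji VALUES (?, ?)", [("時", "time"), ("寺", "temple")])

stores = [FlatFileStore(["時\ttime\n", "寺\ttemple\n"]), SqliteStore(conn)]
```

Since `quiz_prompt` never touches a file or a connection directly, deleting `FlatFileStore` later would not require any change to the quiz code.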

Speaking of kanji, I originally wrote the program to study for the JLPT—the Japanese Language Proficiency Test—and I just got an email from the test organizers saying that they’ve sent my admission voucher in the mail, and I should be getting it shortly. The test is on the first Sunday of December every year—if my calculations are correct, that puts it at four weeks from today. I’m excited! I’m only taking the very easiest level (JLPT 4), but it’s not a trivial test to pass. I’ll have to study hard!


My tsumego quiz program is coming along nicely. I have a database of half a dozen or so problem trees, making a little over a hundred individual tsumego problems. The GUI is more or less complete, and the Leitner engine from my Japanese quiz program has been separated into its own module and successfully integrated with this new program.
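For readers who haven't met the Leitner system: cards live in numbered boxes, a correct answer promotes a card to the next box, a wrong answer sends it back to the first, and higher boxes are reviewed less often. The Python sketch below shows that general idea only; `LeitnerDeck` and its review schedule (box b comes up every 2**b sessions) are assumptions, not the engine described here.

```python
class LeitnerDeck:
    """A minimal Leitner scheduler that is independent of what a card is."""

    def __init__(self, cards, num_boxes=5):
        self.num_boxes = num_boxes
        # Every card starts in box 0, the most frequently reviewed box.
        self.box_of = {card: 0 for card in cards}

    def answer(self, card, correct):
        """Promote a card on a correct answer, reset it on a wrong one."""
        if correct:
            self.box_of[card] = min(self.box_of[card] + 1, self.num_boxes - 1)
        else:
            self.box_of[card] = 0

    def due(self, session):
        """Cards to review in this session: box b is due every 2**b sessions."""
        return [
            card for card, box in self.box_of.items()
            if session % (2 ** box) == 0
        ]


deck = LeitnerDeck(["problem-1", "problem-2"])
deck.answer("problem-1", correct=True)  # problem-1 moves up to box 1
```

Because the deck only stores opaque card keys, the same class could schedule kanji or tsumego problems, which is presumably what makes a separate module worthwhile.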

There’s still plenty to do, though, even before a first release. I haven’t come up with a good way of serializing the problems to disk, so progress can’t be saved yet. I’m also not doing any sort of de-duping in the problem trees, so sometimes the same problem shows up multiple times, even though it should really only appear once. And there are some statistics that I’d like to show, but currently just have placeholders for.
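Both missing pieces could be approached along these lines. This Python sketch assumes a problem is just two lists of stone coordinates plus the side to move, which is a guess at the data rather than the program's real problem format; it also ignores rotated or mirrored duplicates. The names `canonical_key`, `dedupe`, `to_json`, and `from_json` are hypothetical.

```python
import json


def canonical_key(problem):
    """Build a hashable key that ignores the order stones were listed in."""
    return (
        tuple(sorted(tuple(stone) for stone in problem["black"])),
        tuple(sorted(tuple(stone) for stone in problem["white"])),
        problem["to_play"],
    )


def dedupe(problems):
    """Keep the first occurrence of each distinct position."""
    seen = set()
    unique = []
    for problem in problems:
        key = canonical_key(problem)
        if key not in seen:
            seen.add(key)
            unique.append(problem)
    return unique


def to_json(problems):
    """Serialize problems to a JSON string that can be written to disk."""
    return json.dumps(problems)


def from_json(text):
    """Load problems back from a JSON string."""
    return json.loads(text)


problems = [
    {"black": [[1, 1], [2, 1]], "white": [[1, 2]], "to_play": "black"},
    {"black": [[2, 1], [1, 1]], "white": [[1, 2]], "to_play": "black"},  # same position
    {"black": [[1, 1]], "white": [[1, 2]], "to_play": "white"},
]
unique = dedupe(problems)
```

Coordinates are stored as lists rather than tuples so that a JSON round trip gives back exactly the same structure.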

Unfortunately, the web site I typically post on and read for all my Go-related needs has been out of commission for nearly a week now. Maybe that means I can get an initial release ready before the web site comes back—but much of my motivation for Go-related endeavors comes from that site, so without that source of inspiration, who knows how long I’ll have before the fickle winds change and I’m blown to a different, newly-exciting project.
