Random Thoughts

This is going to be another boring post. It’s specifically about how the app shuffles cards and about randomness in general, but it’s also going to get into things like statistics and heuristics.

The short story is that new cards show up in order during their first study session. This is because I like to edit my cards from the Study View and I’m usually working from a list. If you don’t want new cards to show up in order there is a setting you can enable to randomize them. For cards that are not new, I use arc4random_uniform([array count]) to choose random numbers and a Fisher-Yates shuffle to actually shuffle the cards. Even though these procedures may seem to be the best and/or most standard ways of getting random numbers and shuffling a set, you might still notice patterns and/or the resulting shuffles might not seem random to you. How can that be?
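For the curious, here is a minimal Python sketch of a Fisher-Yates shuffle. It is not the app's actual code (the app is not written in Python); `fisher_yates_shuffle` and `rand_below` are names made up for this illustration, and `random.randrange` stands in for `arc4random_uniform` because both return a uniform integer from 0 up to, but not including, their argument.

```python
import random


def fisher_yates_shuffle(cards, rand_below=random.randrange):
    """Return a shuffled copy of cards using the Fisher-Yates algorithm.

    rand_below(n) must return a uniform integer in [0, n), which is the
    same contract arc4random_uniform(n) has.
    """
    deck = list(cards)
    # Walk from the last position down to the second one, swapping each
    # position with a uniformly chosen position at or before it.
    for i in range(len(deck) - 1, 0, -1):
        j = rand_below(i + 1)  # 0 <= j <= i
        deck[i], deck[j] = deck[j], deck[i]
    return deck


print(fisher_yates_shuffle(["日", "月", "火", "水", "木"]))
```

Note that `j` is allowed to equal `i`, so a card can stay exactly where it was. That is part of what makes every ordering equally likely, including the original one.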

Computers are not really capable of choosing things at random. Generally speaking, when one asks a computer to generate a random number it must go through some process to create that number. It might use the current time or some other internally known condition (voltage, temperature, etc.) as a basis for generating the number. If we could recreate the precise condition or conditions that existed in a system when it generated a random number, we could cause that system to generate the same random number over and over again. That’s not truly random. The same thing is true of the universe. If you precisely reproduce the conditions of the universe (and I mean the precise position, velocity, and spin of every fundamental particle in existence), you will always get the same outcome (quantum fluctuations notwithstanding). Even though you may think you have free will, you wouldn’t be able to change your mind if everything were precisely the same. Given the same conditions, you must always choose Pepsi instead of Coke. Given the same conditions.
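The "same conditions, same outcome" idea is easy to see with a seeded pseudo-random generator. The sketch below is Python and only illustrates the principle; `draw_numbers` is a made-up helper, and the seed plays the role of the recreated conditions.

```python
import random


def draw_numbers(seed, count=5, upper=100):
    """Draw `count` pseudo-random integers in [0, upper) from a generator
    that starts in a known state (the seed)."""
    rng = random.Random(seed)
    return [rng.randrange(upper) for _ in range(count)]


first = draw_numbers(seed=42)
second = draw_numbers(seed=42)

# Same starting conditions, same "random" numbers, every time.
print(first == second)
```

Change the seed and the sequence changes; keep it and the generator has no choice but to repeat itself.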

I personally prefer Coke, by the way. But there must be some conditions that would cause me to choose Pepsi. I can’t really imagine what those conditions might be but if they existed, I must choose Pepsi. And if we recreated those conditions precisely through many trials, I would choose Pepsi every time. Get it? I would not be able to change my mind.

But, can that really be true? Certainly, you can change your mind and random things do happen all the time, right? Well, now we’re getting into a philosophical debate. What does random mean? Does it mean something that seems random to people and not something that is actually random in the mathematical sense? If we’re only concerned about things that seem random to people then, yes, randomness does exist. If we’re talking about mathematical randomness…actually, I’m not sure.

I think truly random things may not exist (again, ignoring the quantum world) but what we can do is try to reach a state of being as random as possible and call that state mathematical randomness. It’s just like circles. While we can mathematically describe a circle, in fact there seem to be none. There seems to be nothing in existence that is actually round. There are very round things. Things that are so round that they are almost truly round. Things like the sun – the roundest thing we know of. But we are capable of measuring the fact that it is not round. What we mean when we call something round is that it is round enough to be practically round. What I mean when I say random is just random enough to be practically random.

So, let us ignore the possibility that randomness may not even truly be possible. Let’s just establish a model with two phrases: seems to be random and mathematically random. Seems to be random will simply mean that it’s difficult for the average person to notice a pattern, and mathematically random will mean that it’s actually approaching a uniform distribution, where every possible outcome shows up about equally often over many trials.

When I implement a random function, such as a function that should shuffle or choose virtual flashcards at random, do I want mathematical randomness or do I want something that seems random to people? Well, wait a second. Shouldn’t that be the same thing? If it’s as random, or pseudo-random as it were, as it’s possible for a computer to be, won’t that also seem random to people?

Almost certainly not. Humans, generally speaking, are horrible at statistics. We do not have a natural affinity for it. What we do have a natural affinity for is patterns. We seek them out. Visual patterns such as faces. We see them where they do not exist. We see faces in burnt toast and in clouds and in mountain shadows on Mars. There are no faces there but we see them all the same. We also seek out numerical patterns. No, I’m not talking about those math problems from your tests in junior high school. I’m referring to numerical patterns that we may not even realize are related to numbers and/or statistics such as patterns in the occurrence of events.

There is a very good reason for this pattern-seeking behavior. It’s so you don’t have to waste time and energy actually using your brain to think about things all the time. Your brain is an amazing piece of biological machinery. And like many other amazing pieces of machinery, it’s an energy hog. Despite what you may have heard or read elsewhere, unless you had a developmental abnormality or some sort of inadvertent head trauma, you use 100% of it at almost all times and it consumes as much as 30% of the energy your digestive system extracts from the food you eat. Actually, I’m not sure about that energy number as I’ve seen estimates ranging from 20-25% and heard seemingly well-informed people mention numbers as high as 30%. Anyway, it is the single most draining organ that you possess. That’s good because it makes you really smart. But it’s also kinda bad because it consumes a lot of energy. Because of this, it has various built-in efficiency systems that it makes use of in order to keep the amount of energy it requires to a minimum.

One of these systems or perhaps series of systems manifests itself as something known as heuristics. This goes directly back to patterns in the occurrence of events. It’s quite simple to see A and then B and know that A caused B. Especially if we see this sequence occur over and over again. A and B may in fact be completely unrelated or due to some unobserved and unknown third condition but heuristics might cause us to simply link A and B. Likewise, if I see A, B, and then C and notice some similarity or relationship between the three (for example, they’re the first three letters of the Roman alphabet in order) then I might have a tendency to think that something fishy is going on if I had been told that I would be shown random letters.

But is it really fishy? Can I deduce fishiness after seeing only three trials? What’s the likelihood or mathematical probability that you would get A, B, and then C randomly? Certainly, it’s not 0 but I don’t know what the actual probability is because I’m not good at statistics. I could work it out but that’s not the point. The point is, I can’t immediately see the answer or even easily find it after a bit of thought. I would have to really think about it. Perhaps even get out a pencil and sheet of paper to figure it out.
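For what it's worth, the pencil-and-paper version is short once you pick a model. The Python sketch below assumes each letter is drawn independently and uniformly from a 26-letter alphabet (with repeats allowed); that assumption and the function names are only for illustration.

```python
from fractions import Fraction


def p_exact_sequence(length, alphabet_size=26):
    """Probability of one specific sequence, e.g. exactly A, B, C."""
    return Fraction(1, alphabet_size) ** length


def p_any_consecutive_run(length, alphabet_size=26):
    """Probability of any in-order run: ABC, BCD, ..., XYZ."""
    runs = alphabet_size - length + 1
    return runs * p_exact_sequence(length, alphabet_size)


print(p_exact_sequence(3), float(p_exact_sequence(3)))
print(p_any_consecutive_run(3), float(p_any_consecutive_run(3)))
```

Under that model, A-B-C exactly is 1 in 17,576, and "some three letters in alphabetical sequence" is 24 times as likely. Small, but nowhere near zero, and exactly as likely as any other specific three-letter result such as Q, D, M.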

I don’t think that you can really figure out anything statistically speaking after only three trials. In fact, you would probably need to see quite a large number of trials in order to get a good spread. The more the better. So, if you notice that a few of your cards show up in order or near each other or in reverse order or whatever pattern you happen to notice, what does that mean? What if the same pattern occurs twice in a row? Despite the fact that you may feel like something strange is going on, it’s probably quite normal. Even unlikely things have some probability associated with them. The fact that they are not impossible means that, given enough trials, they will eventually occur. If you really want to show that something fishy is going on, do a thousand or so trials, work out the probability spread that you observed and compare it to what you would consider random. You’ll probably notice that there’s only a small percentage difference between what you would expect and what you actually observed. Which would mean…nothing fishy.
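Here is one way such an experiment could look, sketched in Python rather than with the app itself. It shuffles a small deck many times and records where one particular card lands; `position_frequencies` is a hypothetical helper, and Python's built-in `shuffle` stands in for the app's shuffle.

```python
import random
from collections import Counter


def position_frequencies(deck_size, trials, seed=None):
    """Shuffle a deck `trials` times and return how often card 0 ended up
    in each position, as a fraction of all trials."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        deck = list(range(deck_size))
        rng.shuffle(deck)
        counts[deck.index(0)] += 1
    return [counts[pos] / trials for pos in range(deck_size)]


freqs = position_frequencies(deck_size=5, trials=10_000)
expected = 1 / 5
for pos, freq in enumerate(freqs):
    print(f"position {pos}: observed {freq:.3f}, expected {expected:.3f}")
```

With a fair shuffle, each position should come out close to the expected share, and the gap should shrink as the number of trials grows.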

I could implement checks to make sure that the seemingly fishy things don’t occur. I could make sure cards don’t get shuffled back into their original positions. I could ensure that cards that were created within a short interval of each other or cards that feature the same kanji don’t get shuffled into positions near each other. I could ensure that the same patterns that occurred in a previous shuffle don’t occur in subsequent shuffles. However, doing these things would actually make the shuffles less random. It would seem more random to the casual observer but these arbitrary tricks would actually destroy the overall distribution. So, even though seemingly fishy things might (or almost certainly will) occur, I think I’m going to just leave the current implementation as is. It may not seem very random but I actually think that means it’s approaching the standard of randomness I would prefer to see.
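To make "less random" concrete, here is a small Python sketch that counts how many orderings survive one such check on a four-card deck. The rule used (no card may stay in its original position) is just one example of the tricks described above, and the helper names are invented for the illustration.

```python
from itertools import permutations


def allowed_orders(deck, reject):
    """List every ordering of `deck` that the `reject` rule still allows."""
    return [order for order in permutations(deck) if not reject(order, deck)]


def has_card_in_original_position(order, deck):
    """True if any card sits where it started."""
    return any(new == old for new, old in zip(order, deck))


deck = ("A", "B", "C", "D")
unfiltered = allowed_orders(deck, lambda order, original: False)
filtered = allowed_orders(deck, has_card_in_original_position)

print(len(unfiltered), len(filtered))  # 24 orderings versus 9
```

One innocent-looking rule throws away 15 of the 24 possible orderings. The shuffle may look more scrambled, but it can no longer produce most of the results a fair shuffle would.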

Happy studies!

What the user wants…the user will get: Study Due Variance

5 vs. 2

I’d like to talk a bit about how kanji Flow’s SRS implementation works.

This is going to be boring so I’ll leave the pictures out this time and just focus on the story.

Basically, I use the same SM2 algorithm that many other software programs use. If you look at that SM2 link, you’ll see that it can be kind of complicated. A bit of Googling will show you that there is some debate about whether it really is good for memorizing and what the best implementation is. I’m not really interested in getting involved in that debate as I don’t have enough knowledge to offer any fruitful opinions. Basically, it seems to work pretty well and it makes things less complicated for me.
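For readers who don't want to follow the link, here is a rough Python sketch of SM-2 as it is usually described: a quality score from 0 to 5, intervals of 1 day, then 6 days, then the previous interval times an easiness factor that never drops below 1.3. This is not kanji Flow's code; the names are invented, and details such as rounding and exactly when the easiness factor is updated vary between implementations.

```python
import math
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0    # successful reviews in a row
    interval_days: int = 0  # days until the next review
    easiness: float = 2.5   # SM-2 "E-Factor", never below 1.3


def sm2_review(state, quality):
    """Apply one review with quality 0-5 and return the new card state."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: start the repetitions over, easiness unchanged.
        return CardState(0, 1, state.easiness)
    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = math.ceil(state.interval_days * state.easiness)
    miss = 5 - quality
    easiness = max(1.3, state.easiness + 0.1 - miss * (0.08 + miss * 0.02))
    return CardState(state.repetitions + 1, interval, easiness)


card = CardState()
for quality in (5, 5, 5):
    card = sm2_review(card, quality)
    print(card)
```

Three perfect answers in a row push the card out to 1 day, then 6 days, then 17 days in this sketch.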

Of course, my implementation is customized to the way I like to study. I used to use the Leitner System so that basic format is mixed in. I like to review my cards steadily for a couple of days if I miss one and once I’ve memorized it again I want it to get an interval based off of its entire history. So, that’s basically how kanji Flow works. How does that compare to other software?

Anki is probably the most popular SRS software available. Anki uses more complicated intervals and gives you a few different options when you look at a card to determine when it should be scheduled again. I think you really need to understand that part of the documentation if you want to use Anki as effectively as possible. kanji Flow basically only has two choices: unknown or known. You can also pass but that doesn’t do any math on the card’s difficulty.

It may seem like kanji Flow doesn’t give you as many choices but I think it doesn’t really make much difference. Let’s say I gave you the option of choosing a card’s difficulty manually. You could select 1-5 and, for a particular card, selecting a 4 would cause the card to be due for study 17 hours earlier than selecting a 3. Mathematically speaking, it’s different. Practically speaking, it doesn’t matter. You’re still gonna get that card again on Thursday. Or, perhaps, Wednesday.

The problem (is it a problem?) is due to the granularity at which humans tend to schedule their activities. Most people use days to schedule their time and many people seem to like to do things like studying once a day. So those small differences in determining the card’s difficulty probably aren’t going to have much of an effect on when your cards are due. Knowing vs. not knowing the card has a big effect, though. So, that’s the only decision you have to make.
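A small Python sketch shows how a 17-hour difference can vanish. It assumes a person who studies once a day at a fixed hour and sees a card at the first session on or after its due time; the 8:00 session, the dates, and the function name are all invented for the example.

```python
from datetime import datetime, timedelta


def next_session_at_or_after(due, session_hour=8):
    """Return the first daily study session that starts at or after `due`."""
    session = due.replace(hour=session_hour, minute=0, second=0, microsecond=0)
    if session < due:
        session += timedelta(days=1)
    return session


# Hypothetical due times: rating a card 4 makes it due 17 hours earlier
# than rating it 3.
due_if_rated_4 = datetime(2024, 1, 3, 10, 0)             # Wednesday 10:00
due_if_rated_3 = due_if_rated_4 + timedelta(hours=17)    # Thursday 03:00

print(next_session_at_or_after(due_if_rated_4))  # Thursday's 8:00 session
print(next_session_at_or_after(due_if_rated_3))  # Thursday's 8:00 session
```

Both ratings land the card in the same Thursday session. Move the earlier due time to Wednesday 07:00 and the two ratings do end up a day apart, which is the "Or, perhaps, Wednesday" case.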

I used to use a time-based system that would set the card’s difficulty and study date based on how long you looked at it versus how long you normally looked at the other cards in the same lesson. A lot of testing showed me that it just really didn’t matter that much, so I took that code out. Just keep it simple: Do I know this or not? Actually, if you have to ask yourself that question, you don’t know it. Swipe it to the left and review it.

Happy studies!

丁重にお断りさせていただきます (I Must Respectfully Decline)

I’d like to address some of the requests that I’ve gotten. I’m going to talk about the two requests that I’ve received the most and requests for additions in general.

Some people want to be able to write on the screen and have the app check to see if you wrote the kanji correctly. I will never add this functionality to the app for two reasons: (1) I’m not great with graphics stuff so it would take a lot of time and work to implement. (2) I would have to add a database of kanji to the app for such a feature to reference against. I don’t want to add a database to the app because it would dramatically increase the size and imiwa? is already available. I could access such a database via the internet but I also don’t want to require a network connection for anything unless it’s absolutely necessary (current text-to-speech technology inherently requires a network connection if your OS doesn’t have the technology built-in).

I also practice writing kanji. Certainly, it isn’t necessary considering current technology; you’ll probably only ever really need to be able to type the reading and select the correct kanji on your computer or phone. However, I personally feel that practicing writing helps me to recall kanji better. I might be wrong about that, however. I think if you want to practice writing, you should get out a sheet of paper and a pen. I know that might not be convenient on the train or bus but that’s the way I think you should do it. If you’re not going to use a pen and paper you may as well just write the kanji on your palm with your finger or just “write” it in your head. That’s what I do on the train sometimes. I just really don’t think that being able to write on the screen has any real value. I might be wrong about that too. You might disagree with me. If so, you should find another application that lets you do that. That functionality will never be in kanji Flow.

Some people want to be able to see kanji stroke orders for the kanji on their cards. Again, that would require adding a database of stroke orders (or accessing such a database via a network connection) and, again, that feature is already just a couple of taps away in imiwa?  I’m probably never going to mirror features that are already available in imiwa? even though it might be slightly more convenient. I really don’t think it’s so terribly inconvenient to just tap on the imiwa? link. Imagine if there were no dictionary integration at all. Then you’d have to manually copy and paste (or actually type it yourself!!!) into your dictionary or Google. I think what’s available now is pretty good and I’ll keep trying to improve the dictionary integration as much as possible. But, I’m not going to put such features directly into kanji Flow.

I’ve also received a lot of other requests. A lot of those requests have been implemented into the app. Things like new sorting options or things that make stuff the app already does easier or less annoying are generally easy to implement and I’m usually happy to do so. Things that the app doesn’t do at all probably aren’t going to be implemented ever. This is a flashcard application. If you’ve got a great idea for something else then that’s great and I appreciate you telling me about it but if that idea has nothing to do with kanji Flow then I really don’t know what else to say other than, thanks for telling me about that.

If you’ve got a great idea for an app but there doesn’t seem to be anything already on the store that meets your needs, I highly recommend Apress’s series of books if you’re interested in learning to program and making iPhone apps.

Happy Studies!

kanji Flow’s Philosophy

Well, actually, I guess it’s more like my philosophy about kanji Flow. Basically, I’d like to offer a bit of info about my motivations behind kanji Flow, why it looks the way it does, and what I might do with it in the future.

I made kanji Flow because I’m a student of Japanese and I needed a way to automate my memorization of kanji and vocabulary. Originally, I used a program on my old Windows Pocket PDA called Stackz. When I switched over to an iPhone, I wanted to keep using Stackz since I had already put so much time into my Stackz study decks. However, the developer of Stackz, MindDate Software, never made a version for iPhone. I actually communicated with the people (guy?) at MindDate quite a few times about a possible iPhone version but, in the end, they said it wasn’t going to happen. I looked for other options in the App Store but there didn’t seem to be anything good available so I decided to try and make something for myself.

I kind of copied the basic UI from Stackz which was okay for me but actually doesn’t seem to be terribly intuitive for new users. I’ve gotten complaints about it, usually along the lines of, “What am I supposed to do with this?” so I know I should probably try to come up with something better but, basically, it works. Those buttons are Leitner stacks, by the way. Honestly, I wasn’t very good at setting up UI stuff when I was getting started with iOS development and it took a long time to get working so, as long as it keeps working and unless someone sends me a really good idea, I’m probably just going to leave it.

kanji Flow is not a commercial product. It’s just something I do in my spare time, again, because it helps with my Japanese studies. I also include data from some open source or creative commons projects as well as featuring integration with a couple of free or free to use resources so I feel it’s best to make this app available to other users for free as well. If other people find it useful and it helps with their Japanese studies, that’s good enough for me.

I still use the app myself every day and I don’t foresee my Japanese studies ending anytime soon so I intend to keep updating the app to ensure it works with all future versions of iOS. If other people have good ideas that aren’t too complicated or beyond my skills or available time to implement, I’m happy to add stuff to the app for anyone that asks. However, there are some features I’ve received requests for that I probably won’t ever add to the app. I’m going to address a couple of those features in the next post.

Happy Studies!