Monthly Archives: March 2012

Factorize, More Boxes

Boxes and arrows, I know, it doesn’t look like a program yet. Still, it all is going somewhere, I promise. Let’s take a minute and talk about how factorization works. It’s pretty straightforward: either a number is evenly divisible by some other number (a factor) or it isn’t. Let’s look at an example: 48. What numbers go into 48? Well, I always think of it as 6 times 8, but it’s also 4 times 12. I’m the one with the keyboard, so let’s do 6 times 8. Now, let’s look at 6. Is it evenly divisible by anything? Sure, 2 and 3. How about them? Nope, they are not evenly divisible into smaller integer factors. That’s what it means to be prime, and two and three are the first two prime numbers. Eight, on the other side, divides down into twos. Here’s a tree, where each number gets split down into factors. When a factor can’t be split any more, that factor is prime:

The way the program is going to work (refer back to the previous drawing) is to start out with a number to factorize — let’s repeat this example and use 48 — and a list of primes. It’s going to start at the small end of the list of primes and keep trying until it runs out of primes or until it divides the number all the way down to primes. Here’s the tree the program would draw:

Here, the program would try the first prime, two. That works, so it would divide 48 by 2 and get 12. It would start over with 12 and see if two goes into 12. It does, and it would try again on 6. That works, too, and it would try again to divide 2 into 3. That doesn’t go evenly, so it would go to the next prime. The next prime is 3, which does go into 3 evenly. 3 divided by 3 is 1, so the program would know that it was done. This kind of drawing is called a tree (it’s upside down, don’t worry about it, but 48 is the root and all the primes are leaves). I’ve colored the primes green to highlight how they’re not being divided, and because they’re the leaves.

This process is pretty darned simple to explain. The only thing missing is, gosh, a list of prime numbers! It’s getting late and once again, I’m going to have to leave that for the next time. Here’s a bit more detail on the flowchart, though. This is the detail for the “let p be the next prime number” box. The first time we run through, we want to start at the circle labeled, “A,” but on all the following times, we want to come in at the circle marked, “B.”

Why You Gotta Hate?

For the past three weeks I’ve been working on a Bugzilla system. Specifically, working on fixing small bugs in the UI code of a customized Bugzilla installation. This system has been highly customized, with lots of new Javascript as well as a lot of server-side extensions to customize the process of bug entry and workflow. Bugzilla, in case you didn’t know, is written mostly in Perl. There’s some Javascript and the pages are all rendered with Template Toolkit, so basically, it’s Perl. The last time I worked with Perl, I wrote a couple of scripts to automate some image conversion on Windows machines. At that time, I didn’t need an IDE; Notepad was plenty for the needs of a program that used a single file. Bugzilla is huge. It has hundreds of source files and none of them are self-contained. Every file refers to at least one other file and most refer to several.

Bugzilla is written in Perl but it’s object-oriented Perl. After one day of using BBEdit (a swell text editor that I’ve used on and off for decades) to try to navigate around, I found myself really wishing for an IDE. There exist projects that claim to be Perl IDEs. There’s a Perl plugin for Eclipse (that is slow and just functional enough to be tantalizing, but all it really does is syntax coloring, and not even very well). There’s Padre, which looks swell but doesn’t do even as well as BBEdit for editing and the other stuff doesn’t work. Eventually I tried IDEA, since I have a license for it. It does syntax coloring and multi-file search and all that stuff that BBEdit does, while adding the knowledge that everything under a given folder is part of the project and not to worry about stuff outside the folder. It doesn’t really have Perl support so I still can’t do the really powerful stuff an IDE really lets you do. I still wish for a tool that lets me put the caret in a symbol and then, with a keystroke, go to the declaration of that symbol, wherever that may be within the project.

Other tools that my group support are written in Java, Ruby, and Groovy. It’s enough to make me cry, just a little bit. I’ve been going through Ruby and Groovy tutorials and, okay, I get the amazing power of Rails/Grails. That right there is 100% awesome. If you’ve gotta spin up a prototype web service in a hurry, man, Ruby on Rails is hard to beat. Looking at the language features, though, and the language comparisons (try Googling for “compare ruby java” or “groovy language features”) I start feeling tired. Why do so many people make such a big deal about closures? Are there really that many programmers developing self-modifying systems? Seriously, Lisp is an amazing language that is super powerful and awesome, but most of its awesomeness would go to waste if you’re writing a blog or an online bill paying site.

I’m starting to wonder why so many people seem to be allergic to languages where you can know, just by looking at the declaration of a method, what the method expects the parameters to be. Perl doesn’t even declare parameters, Ruby and Groovy do but the type is optional at best and usually absent. When my little chunk of code gets an object and I want to do something with it, I start wondering, “What kind of thing is this object? What can it do? What kinds of state does it have? Is it even reasonable for me to try to do anything with it?” Some might argue that I (and my code) shouldn’t wonder these things. It should be the job of the caller who’s providing the object to be sure that the object is appropriate for my code. But that’s just sweeping the problem under the rug: how in the world can the caller know, since my method doesn’t advertise parameters? It’s the same question, really.

The big advantage that I keep seeing touted on tutorial websites is that these scripting languages require less typing than stricter languages like Java or C++. You know what? I’ve got a solution for that: learn to type. Holy cow, it’s not like memory is that expensive, that the difference between having braces and no braces in the source code is gonna kill you. Source code should be readable by humans. Okay, open question to all you Perl/Python/Ruby/Groovy/whatever-the-fuck-scripty-language guys out there: why do you hate knowing what type of thing a variable is? Why do you hate tools that tell the programmer what’s going on? Why do you think that the best source code editor is ed? Jeez, guys, programming is fun, why do you have to make it so tedious?

Factorization: Some Drawings

So here’s the very high level flowchart for the factorization program. The key insight here is that the things the program does are grouped together and put inside pretty inclusive boxes. The lower-level flowcharts will start to unpack these boxes. The really interesting thing about this is that for any box we don’t really know how to implement, we can just put in a stub that pretends to do the work. Like, “Oh, I don’t know how to do prime factorization so I’m just going to print the number 5 and move on.” Computer programming is all like this: break the problem down into smaller and smaller chunks until each chunk is something you know how to tell a computer to do. Continue reading

Programming Assignment: Prime Factorization

Last night the Badb had to find the prime factorization of a number. Thrill moment! This is exactly the problem I once wrote a computer program to solve and wound up entering that program in the school science fair. That was pretty cool.

Back when I had to solve this problem, my computer had a pretty severe limitation within its BASIC interpreter: the highest integer value it could deal with was 65,535. The clock speed on the CPU was pretty darned slow, as well. These meant that any program I wrote had to be pretty efficient if I wanted to be able to interact with it — I could feel myself aging while I waited for the computer to count to 100 — and I didn’t have to worry about really huge data structures. When the biggest array was guaranteed to hold fewer than 100,000 items, FOR..NEXT loops were not very scary.

Okay, enough reminiscing. Here’s the assignment: write a BASIC program that will accept a user’s input of a positive integer greater than 1 and less than 65,536. The program will output the prime factors of the number. The program must guard against empty, non-numeric, and out-of-range values. If the user inputs a value that does not meet the acceptance criteria then the program must print out an error message and prompt once again for input. After writing out the prime factorization of the number, the program should ask the user whether to factor another number or to exit, and then behave appropriately.

Here are some hints:

  • a sieve of Eratosthenes would probably be a good place to start to get a bunch of prime numbers
  • Any factor of a number that is greater than the square root of that number is, by necessity, part of a pair of factors — and the other half of that pair is less than the square root of the number.
  • The modulus operator (mod) might be helpful.
  • Try using pencil and paper to draw a flowchart for how to solve the problem. Start with boxes that represent big steps (“get user input”, “compute prime numbers”, etc.) and then explode the boxes until each box represents a line of actual code. This can seem kind of extreme, but it totally works and it keeps you from having to remember everything at once.

I’m gonna make some time to try to write this program myself, but it’d be super cool if any of the three of you people reading this could beat me to it. I’d be really proud!

My New Commute

Highway 85 is, unless you’re a carpool, unusable during commute hours. I’ve been trying different ways to get to NASA for the past week and so far the best I’ve been able to come up with is 70 minutes. Without freeways, there are basically two routes: mountains north and a short run east across Palo Alto / Mountain View, or mountains northeast to Los Gatos and then surface streets to Mountain View. The second option seems to run about 70 minutes both ways, while the first is about 75 (there are three schools on Page Mill and traffic is a bit slower). Five minutes is pretty much a wash. The valley route, though, has a little bit less mountain twistiness and I can get slightly better mileage on the flats even in slow commute traffic. I can drive less aggressively and be all calm and stuff, but there’s just no way around gravity when you’re going uphill.

Now I’ll start keeping track of what stores I’m passing to see if I can combine any errands with my commute.