nickbaxter ([info]nickbaxter) wrote,
@ 2009-06-28 13:52:00
Entry tags:uspc wpc puzzles nikoli

2009 USPC Notes


The results of the 2009 Google U.S. Puzzle Championship are now posted on the US Team site. And for the fourth year in a row, [info]motris had the top score. For more on the results, check out the championship web site.

Here I would like to offer some insight into just the test and the puzzles.

Battleships
Puzzle #1 has always been an easy puzzle, designed to fill time while the rest of the test is printing. I finally figured out that adding white space will cause the page to print faster, so I moved puzzle #2 to the second page. This (and the very tight formatting used at WSC4) also got me thinking about the overall pagination of the test. For the hardest problems (20-23), I separated them one per page to allow more space for notes, similar to a typical WPC one-puzzle-per-page format.

Nikoli
I have been in contact with Nikoli staff since 2000, but never seriously considered approaching them to contribute to the USPC until this year. Instead, in 2002 I sent email to some of my regular contributors, prompting them to write their own versions of Masyu, Yajilin, and Corral. After a successful partnership with Nikoli to provide the 2008 US Sudoku Championship puzzles, the time was finally right to get them involved with the USPC. My hope is that the resulting cross-promotion will increase English-language participation at Nikoli.com, and induce implementation of an English-language Nikoli forum.

(I had occasion to visit the Nikoli office in Tokyo in December. Nearby Ryogoku subway station does in fact have Sudoku-inspired artwork!)

I wanted the Sudoku puzzle to be a modest #2 puzzle. But I wanted the others to be more challenging. To get the right level, Yajilin and Masyu needed to be Giant sized. Some don't like the size, but in order to get the "Nikoli solving experience", I think you'll just have to get used to it!

KenKen®
This Sudoku variation has become surprisingly popular. [info]canadianpuzzler submitted an interesting diagramless KenKen variation (that should appear on LJ soon), and I felt compelled to include a standard version as a baseline. Test solving revealed a slight flaw in Craig's puzzle, and I decided not to use it, replacing it with a much harder puzzle.

I decided to replace the generic KenKen with something a little more interesting; fortunately Bob Fuhrer at Nextoy anticipated my request with a page full of operationless puzzles from which I got to choose the best.

Writer's Block
To better ensure that a simple one-word answer wouldn't get cooked, I asked [info]canadianpuzzler to add three more non-answer words. I wanted JD SALINGER to be the extra 8-letter answer, but the initials created an unreasonable ambiguity.

Coordinate Pairs
This had a similar feel to the Dot Triangles from last year, but certainly one fewer dimension made the overall combinations a bit easier to handle. Even so, this is still pretty much a trial-and-error type puzzle, and might not have much of a future on the USPC.

Triangular Skyscrapers
I spotted this at the 24-Hour Marathon, and asked Aziz to make one for this. Anyone who has solved this knows that the triangle orientation constraint is key, and the more efficiently you can use this along with the Skyscraper rules, the faster you will solve it. The original version included a piece inventory, similar to what you would see in the Domino puzzle. But this really wasn't helpful as is. More useful (at least to me) was grouping the inventory by which numbers were used for each orientation. I decided not to include any piece inventory diagrams, so that solvers would have to discover for themselves what would be most useful.

Window Pain
I suppose the name says it all! This was inspired by a design I saw while reviewing Scott Kim's 2010 Page-a-Day Calendar. [info]motris was the first to sniff out my mock embarrassment in creating such an obnoxious puzzle; [info]devjoe triggered an amusing thread. But experienced solvers should know by now that most counting puzzles on the USPC have some reasonable counting-without-counting technique. The official statistics show that at least 50% of the solvers still haven't figured out that "counting" is rarely a good technique.

The intended technique was to look at the spacing of the vertical and horizontal lines separately, tabulating how many gaps of 1-11 there were in each direction, then multiplying corresponding values and adding them up. Not too bad.

[info]onigame used an alternate approach, with slightly more work, but again not requiring that each individual square be counted. He looked at the upper-left to lower-right diagonals. All squares must have exactly two vertices on one of these 23 diagonals. So for each diagonal, count the number of lattice crossings, including points on the edge (the diagonal starting at the upper-left of the grid has 6 such points), and compute n(n-1)/2. Sum all these values to get the answer!
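For anyone who wants to see the two techniques side by side, here is a minimal sketch. It assumes every line runs the full width or height of the window, and the line positions in `xs` and `ys` are made up for illustration; they are not the actual Window Pain layout. The function names are hypothetical too.

```python
from collections import Counter
from itertools import combinations


def count_by_gaps(xs, ys):
    """Tabulate the gaps between pairs of vertical lines and between pairs
    of horizontal lines, then multiply matching tallies and add them up."""
    x_gaps = Counter(b - a for a, b in combinations(sorted(xs), 2))
    y_gaps = Counter(b - a for a, b in combinations(sorted(ys), 2))
    return sum(n * y_gaps[gap] for gap, n in x_gaps.items())


def count_by_diagonals(xs, ys):
    """Group the lattice crossings by diagonal (constant x - y, with y
    measured downward) and add n(n-1)/2 for each diagonal."""
    per_diagonal = Counter(x - y for x in xs for y in ys)
    return sum(n * (n - 1) // 2 for n in per_diagonal.values())


# Illustrative positions only (assumed, not taken from the puzzle).
xs = [0, 1, 3, 4, 6]   # vertical line positions
ys = [0, 2, 3, 5, 6]   # horizontal line positions

print(count_by_gaps(xs, ys), count_by_diagonals(xs, ys))
```

Both functions should agree on any such grid, since every square is determined either by its side length plus a choice of line pairs, or by its two corners on one diagonal.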

Masyu
This year's answer key format was a step backwards. Next time I'll go to a fixed set of nodes that the solver has to put into the correct order. And I'll try to use this or similar techniques for similar route puzzles (but probably not Fences and Corral).

Lucky Sevens
The initial version of this puzzle had FUCHSIA as the key unused word. Fortunately one of the test solvers had searched for all words that could cross seven other words. Not only did FUCHSIA not satisfy this constraint, but because of the C he quickly discovered that it could not possibly appear anywhere in the grid! Shawn replaced it with TERRACE, which was much better.

Sweet Sixteen
I was surprised that only 39 people even attempted an answer to this problem; the points shouldn't have been intimidating; perhaps the late position was. It actually wasn't much trouble. The sum of 2-16 is 135, so the bottom three rows total 117, and the common row sum is 39. The corner 2-edge triangles must all have the same total, 18, which gives 81 as the sum of the central 3-edge triangle (and forcing the values 11-16). Using this and the row sum, you finally find that the central 2-edge triangle must contain 14-16, and the rest is easy.
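The arithmetic in that chain can be checked mechanically. The sketch below takes the totals quoted above at face value (the puzzle grid itself is not reproduced here, so the variable names are just labels) and confirms that they fit together, including that 11-16 is the only set of six values that can sum to 81.

```python
from itertools import combinations

values = range(2, 17)                # the numbers 2 through 16
total = sum(values)                  # 135

corner_total = 18                    # total of each corner 2-edge triangle, as quoted
bottom_three_rows = 117              # as quoted
row_sum = bottom_three_rows // 3     # 39, the common row sum

# What is left for the central 3-edge triangle once the three corners are removed.
central_total = total - 3 * corner_total   # 81

# Every way to pick six distinct values from 2-16 that hits the central total.
six_value_sets = [c for c in combinations(values, 6) if sum(c) == central_total]

print(total, row_sum, central_total, six_value_sets)
```

Since 81 is the largest total any six of the values can reach, the search leaves exactly one candidate set.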

I named the puzzle to suggest a square arrangement (instead of triangular) to anyone trying to plan ahead. And yes, I was tempted to make this a 16 point problem. But then I thought better of it for fear of even the smallest chance that the rankings could be impacted by such a whimsical change.

Di-Agony
Another surprise: only four people solved this. It's really a very pretty puzzle, but easy to break if you're not careful.

Four Square
Another design I found at the 24-Hour Marathon. I liked the way the quadrants interacted, and that you had to continually shift focus to make progress. I think this one received the most compliments from the solvers.

Inside/Outside Corral
This was not in the initial play test. The first test took about the same time as last year. And since there was no nasty wild card (i.e. Point Triangle), I decided that one more tough puzzle was needed. I had this one waiting for three years for just such an occasion, and I think it finished off the test perfectly.



Overall Composition
Overall the test had the desired balance between puzzle types. Magic Puzzle'Rs was overvalued (it really should have been 15 points), so Lucky Sevens was the only bona fide non-logic grid worth over 15 points. This was not intentional, since I like to have a tough word puzzle as one of the final two. Just unlucky this year, I suppose.

Web Site Improvement
I've hired an MIT EECS major as a summer intern to write a new answer entry page. I hope to have an answer page that accepts one answer at a time, with format validation and a clear button, and that refreshes with the previously submitted answers, giving immediate confirmation that the answers have been received.
 





(7 comments)


[info]devjoe
2009-06-28 10:51 pm UTC
I have adopted the strategy of picking a puzzle I can solve while viewing it on screen while the test is printing, because of bad experiences with past USPCs that inexplicably took 10 minutes or more to print when other documents of similar length and apparent complexity would print in 1-2 minutes. But the last two years the whole test has printed quickly, so maybe this is no longer necessary. In any case, as I wrote in my own blog, I did the square counting first, and of course, it was using the technique you described.

It was good to see the Nikoli puzzles. I have been practicing on Nikoli puzzles for years, even before any of the Nikoli original puzzle types had appeared in the USPC, partly just for the general logic puzzle practice and partly knowing that their popularity meant they were destined to appear. Often I was solving their Giants which are considerably larger than the ones that appear in this year's test, such as 30x45 Masyu (but that size is as inappropriate as the small size of earlier Nikoli puzzles; an hour would be a very fast time for some of those!)

I've been doing puzzles like Ken's Sweet Sixteen on his web site for years. This was a quickie for me.

Only four solvers for Di-Agony. Out of... over 500 entrants. Amazing. I would comment more, but it seems like so many people didn't solve it that I shouldn't spoil it.

I've gotten used to the way the USPC answer page works over many years (in particular that clicking Back after submitting in Firefox brings me back to the page with all my previous entries still there, something that many web forms don't do), but there's certainly room for improvements of the types you describe so that I should not have to click Back at all.



[info]cyrebjr
2009-06-29 12:09 am UTC
For "Window Pain," I used the tabulating method. But after the test, I saw that I had written down the wrong number of side-1 squares (24 instead of 36), a particularly obnoxious mistake. I was too busy "counting" larger squares and adding 11 numbers to notice that one of the larger or smaller ones was wrong.

Maybe puzzles like these could be graded on a sliding scale, as with difference-spotting picture puzzles. You could count off a point or so for each missed or extra shape, down to -5. (My score would have totaled 93 instead of 90, so nobody has to worry about big threats.)



[info]motris
2009-06-29 01:07 am UTC
I've argued (unsuccessfully) for years that the -5 penalty does not make sense on a lot of puzzles, particularly non-guessable ones. I'm unlikely to randomly type 813456972, 342589176 and hit the sudoku answer (so I bet solvers don't either). Maybe -1 points is a reasonable standard to not have complete answer spamming, or a sliding scale of 0, -5, -10, -25, -100, ..., for the errors a person makes so that there is absolutely no benefit to answer spamming.

Counting puzzles, though, are always going to be in the potentially guessable range of puzzles, and so I think a penalty must exist for them. It would be cool to give an increasing penalty where wild guesses can be segregated out, but in practice I bet this is hard, as would be the case this year for Window Pain, where a single error is not being off by 1 but being off by a product of some number of things. In past years, as in 2005 and 2007 where I got -5*2 points for two solutions that were 1 short, I'd have been happier with a smaller penalty, since I had tried, and spending time and being wrong is penalty enough. For Nick, to not have an "arbitrary" scaling affect the results, a flat penalty (for counting puzzles) is logistically best.

So, my USPC improvement is to have puzzles that say "15 points, -1 point penalty" and "20 points, -10 point penalty" where not only the value of the puzzles, but the cost of a mistake, are more appropriate to the kinds of errors solvers will make. The worst experience almost every solver has is thinking they solved a puzzle worth 10 or 15 points and then realizing not only do they lose those points, but they lose 50% or 33% of the value of the puzzle in addition just for trying.



[info]motris
2009-06-29 01:12 am UTC
Playing devil's advocate to my own position, the presence of the -5 point penalty has made me always double-check my entries immediately. I'm not sure if I would act differently if there were no penalty, as I'm still aiming for a complete test each year, but maybe some solvers do try to enter their solutions better because of the penalty which helps both them and the grader at the end of the day.


web site improvement
(Anonymous)
2009-06-29 01:19 am UTC
With many sites attempting new technologies to "improve"
user experience, I have always been glad to see that
wpc.puzzles.com had a straightforward static-like display.
Although answer validation (but not confirmation!) may
be desirable by some, I lobby for your keeping the
original submission format. No matter how well it may
be designed, any substantial change in answer submission
will change the test, and I do not see anything new that
will not take away from the test.

Congrats on another well run competition. Kudos especially
for getting results out by today.

Gerhard Paseman, 2009.06.28


Re: web site improvement
[info]nickbaxter
2009-06-29 04:31 am UTC
I am certainly comfortable not making changes, but the forms are currently implemented with 10-year-old web technology (MS FrontPage) that is no longer supported. So I don't have much choice but to change. However, I will endeavour to keep the experience as similar as possible.

A solver might notice the difference between a computer-generated and a hand-crafted puzzle; similarly one might also notice the difference between computer-scoring and hand-scoring! I know people appreciate the latter, so for the USPC, I review and judge every answer that is submitted.

Prior to this review, all answers are canonicalized to reduce variations, and collated. For each puzzle, I'll get anywhere from 5 to 50 distinct answers. This lets me spot negative trends (what types of mistakes are common, etc.), and allows for unexpected variations of correct answers.
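As a rough illustration of what canonicalizing and collating can look like, here is a minimal sketch. The rule used (uppercase, then keep only letters and digits) and the names `canonicalize` and `collate` are assumptions for the example, not the actual USPC scoring code, and the sample submissions are invented.

```python
import re
from collections import Counter


def canonicalize(answer: str) -> str:
    """Assumed rule: uppercase, then drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", answer.upper())


def collate(answers):
    """Tally how many submissions share each canonical form."""
    return Counter(canonicalize(a) for a in answers)


# Invented submissions for a single puzzle.
submissions = ["123 456", "123,456", " 123456 ", "123465"]

# Most common canonical answers first, ready for manual review.
for canonical, count in collate(submissions).most_common():
    print(count, canonical)
```

The point of the tally is that each distinct answer only has to be judged once, however many solvers entered it.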

Implementing a smarter answer template will dramatically reduce the number of erroneous answers by enforcing stricter formatting, and reduce the review process by maybe 50%. Less work for me means faster results for you. In fact, I hope to automatically prioritize real-time grading of the top papers, and could conceivably have the top 10 results within minutes of test completion.

The merging of separate entries from the same person is a big time sink (since I have to make sure that I catch answer erasures), and accounts for about half of the entire scoring cycle. The new entry design will eliminate that step completely, which will be the biggest win.


C Notes
[info]volodia_p
2009-06-29 02:37 pm UTC
I wonder how the title "C Notes" appeared instead of the original "Hundred" :) Was it supposed to mean something special?


