Week 1 (June 18 - 26):
Saturday, the 18th and Sunday, the 19th:
I landed in Pittsburgh on Saturday afternoon. I was pretty impressed with the city within the first hour, simply because a) the airport flyer is one of the best airport public transport systems I've seen and b) there's a dinosaur in the airport. I spent most of the weekend getting over jetlag and trying to familiarize myself with the campus.
Monday, the 20th to Wednesday, the 22nd:
I moved into my apartment on Tuesday, which is a cute townhouse in Squirrel Hill, about a mile from campus. I'm living with two grad students in the HCI (Human Computer Interaction) department here, and another subletter, also a DMP intern.
The weekly graphics lab meeting was on Tuesday. I was able to get some sense of who was who in the graphics crowd, though I might not end up interacting with most of them. Jernej gave a talk on a rigid body and collision detection paper from this year's SIGGRAPH. I'm not very familiar with research in graphics, so this gave me some idea of the topics of interest and the methodology of a journal paper in the area.
My mentor was away all week at a conference, so one of her grad students, James, introduced me to the project on Monday, and to the code and image database I'd be working with. James and I also met with Sanjeev, the student who had written most of this code last summer, to clear up my understanding of it.
I ran the program on some of the new textures in the database, as well as some of the old textures that had already been run and classified. More than anything else, this helped me navigate the functions and get a good understanding of how the program is organized in relation to the processing and output. The output seemed a little haphazard, so I modified the code that produces the HTML results file to present it better, and to display the group classification more prominently. The input was restricted to bitmaps, which is inconvenient, so I extended it to accept JPEGs as well.
I was reading the paper closely along with the code, and I noticed that the former assumes a threshold error score of 1 for separating the transformations symmetric to the original from those that aren't, while the code uses k-means. I also looked at the results from the run images, and noticed that a few were being misclassified because the k-means clustering was yielding different results from what was expected -- cases the threshold separation seemed to handle fine. Sanjeev clarified that this was not usually the case -- a lot of the other images had scores that were either too high, or not distributed around 1 for some other reason. I decided to try out the thresholding anyway, just to see where it was failing and where it was not, so I put in this change, and left it running on all the images overnight.
Thursday, the 23rd to Friday, the 24th:
Looking at yesterday's results, the thresholding turned out to be not such a good idea... while some of the failed images certainly worked after this, a lot more of the correctly classified ones failed. So I tried putting in tweaks based on the data's range, and found that the threshold separation works best for images whose scores have a lower variance.
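The gist of that dispatch rule can be sketched like this -- in Python/NumPy for illustration rather than the project's MATLAB, and with made-up values for the threshold and the standard-deviation cutoff:

```python
import numpy as np

def split_scores(scores, threshold=1.0, std_cutoff=0.5):
    """Separate symmetry error scores into 'symmetric' (low) and
    'non-symmetric' (high) sets.

    Tightly clustered score sets are handled better by a fixed
    threshold; spread-out sets are handed to 2-means instead.
    The 1.0 threshold and 0.5 cutoff are illustrative, not tuned.
    Returns a boolean array: True = symmetric (low-error) score.
    """
    scores = np.asarray(scores, dtype=float)
    if scores.std() < std_cutoff:
        return scores < threshold
    # simple 1-D 2-means: centers start at the min and max scores
    lo, hi = scores.min(), scores.max()
    for _ in range(50):
        labels = np.abs(scores - lo) < np.abs(scores - hi)
        new_lo, new_hi = scores[labels].mean(), scores[~labels].mean()
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return labels
```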
I also took some time to walk around Oakland and Squirrel Hill, and take a look inside the Carnegie Museums, the libraries, and the Cathedral of Learning. I'll hopefully get a chance to do a more detailed exploration of all these later.
Week 2 (June 27 - July 1):
I got to meet my mentor, and we discussed the nature and possible goals of my work during the summer. We went over some of my initial ideas and which ones would work, and why the others would not.
Amanda, an intern at the electrical and computer engineering department, had suggested a get-together of all the DMP interns and mentors over lunch, which is what we did on Tuesday, at the Carnegie Museum of Art. It was good to meet everyone and hear about their interests and research projects.
I decided to set aside the symmetry score improvement for a bit, and work on edge image comparison instead. As I'd noticed on the failed images, and as Sanjeev had pointed out in his presentation from last year too, a lot of the patterns have large areas of uniform coloring. Hence, the edges in the image are most significant in determining symmetry. The present sum of squared difference comparison ends up yielding uniformly good scores for these images since the overall match between an image under a rigid transformation and itself will be good.
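A toy illustration of that dilution effect (a Python stand-in with synthetic numbers, not images from the database): when only a thin stripe of the image carries structure, a whole-image score barely notices that the stripe fails to match.

```python
import numpy as np

def ssd_score(a, b):
    """Mean squared per-pixel difference between two equal-size images."""
    return float(np.mean((a.astype(float) - b.astype(float)) ** 2))

# A mostly-uniform pattern: only a thin stripe differs between the
# image and its (fake) transformed version, so the whole-image score
# stays deceptively low even though the stripe match is completely wrong.
img = np.zeros((50, 50)); img[:, 24] = 255.0
shifted = np.zeros((50, 50)); shifted[:, 26] = 255.0

full = ssd_score(img, shifted)        # diluted by the uniform background
edges = (img > 0) | (shifted > 0)     # restrict to the stripe pixels only
edge_only = ssd_score(img[edges], shifted[edges])
# edge_only is far larger than full: the uniform area hides the mismatch
```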
I also examined the group identification function more carefully, and fixed a few glitches in the group conditions, like P1 being all scores below 1 or above 1.4 (as opposed to only the former), and the glide reflection groups' conditions, which should not need a reflection symmetry to have a glide reflection (the old code checked for both, which caused some of the P4G, PMG, PG and PGG failures.)
Since I haven't had much of a background in image processing, I checked out a couple of books on introductory image processing math and techniques to go over during the weekend.
Week 3 (July 4 - 8):
My housemates had a barbecue for the fourth of July, which was fun. Got to meet a lot of the HCI grad students, and see the fireworks from Schenley Park.
I started thinking about the edge-image comparison again. For some reason, it wasn't obvious to me that I could just perform an edge detection on the images. I started thinking more in terms of detecting the color changes, and segmenting the image accordingly -- which is similar to a standard edge detection procedure, except that the segmentation would probably be a harder problem than needed.
So after doing an elaborate sketch of a possible algorithm, and realizing that it wasn't getting anywhere, I wrote to Professor Liu about it, who suggested I just take the edge images. I implemented that, but felt a little uneasy about relying solely on the edges as opposed to just weighting them more. Images of brick walls and such, for instance, have a few coloring features within their edges that might be significant for symmetry detection. But I wasn't sure how to assign the weights appropriately, so I just left it at that for the moment.
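One way such a weighting might look, sketched in Python: `np.gradient` stands in for a real edge detector, and both the 0.75 edge weight and the gradient cutoff are pure guesses, not values from the project.

```python
import numpy as np

def edge_map(img, thresh=30.0):
    """Crude gradient-magnitude edge image -- a stand-in for a proper
    Sobel/Canny detector; `thresh` is an illustrative cutoff."""
    gy, gx = np.gradient(img.astype(float))
    return (np.hypot(gx, gy) > thresh).astype(float)

def edge_ssd(a, b, w_edge=0.75):
    """Compare mainly on edges, but keep a small intensity term so
    coloring inside uniform regions (brick faces, etc.) still counts.
    The 0.75 weight is a guess, not a tuned value."""
    e = np.mean((edge_map(a) - edge_map(b)) ** 2)
    i = np.mean((a.astype(float) - b.astype(float)) ** 2) / 255.0 ** 2
    return w_edge * e + (1.0 - w_edge) * i
```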
I was a little disappointed that not all -- maybe less than half of -- the 'rare-edge' images turned out right with the edge-comparison. I wasn't sure what was going wrong, though I suspected it had more to do with bad registration of the edges than the weighting problem.
Week 4 (July 11 - 16):
I stumbled across these research papers on hallucinations while I was doing a search for near-regular textures. Hallucination patterns are not lattices on a 2d x-y plane, but their shapes could be derived by mapping a texture in x-y space to another plane. At least, that's the first thing I thought about when looking at the hallucination images, after having stared at textures for so long! Then I remembered my mentor's paper on symmetry groups of patterns under affine transformations, and wondered what it would be like to examine group detection possibilities on patterns under non-linear transformations.
My mentor was pretty encouraging about the idea, so I started doing some more research on it. I first tried to figure out a way to apply arbitrary functions to matrices in MATLAB, and to find a possible transform back to a parallelogram lattice given a set of points on a lattice. This amounts to finding the function that these points trace on an x-y plane (though it will be only one of the possible transformations). So I wrote a couple of MATLAB functions: one that would create an m-file for a user-input function, and one that would find a possible inverse function, given a set of points and the function class (there are some straightforward inbuilt functions that make that easier).
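The fitting side of this can be sketched with a least-squares polynomial warp -- a Python/NumPy stand-in for those MATLAB functions, where the polynomial model class is purely an illustrative assumption:

```python
import numpy as np

def _design(uv, degree):
    """Monomial design matrix with all terms u^i * v^j, i + j <= degree."""
    u, v = uv[:, 0], uv[:, 1]
    return np.stack([u**i * v**j
                     for i in range(degree + 1)
                     for j in range(degree + 1 - i)], axis=1)

def fit_warp(uv, xy, degree=2):
    """Least-squares fit of a polynomial warp (u, v) -> (x, y), given
    corresponding point sets. The polynomial class is a stand-in for
    whatever function class the user supplies; returns a predictor
    that maps new (u, v) points through the fitted warp."""
    A = _design(uv, degree)
    coef, *_ = np.linalg.lstsq(A, xy, rcond=None)
    return lambda pts: _design(pts, degree) @ coef
```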
However, the hard problem is figuring out the lattice points on the image. Since it does not have a parallelogrammatic lattice, and more importantly, since the tiles do not have a uniform appearance thanks to distortions and local scaling due to a nonlinear transform, the autocorrelation and peak finding cannot be applied. James has been working on a similar problem -- finding lattices of patterns under local deformations -- so I borrowed his code to look over.
The idea is to dynamically find the lattice vectors starting from a point in the image, till the whole image is covered. However, it isn't all that effective at the moment, and besides, I think there should be a method that takes advantage of the global deformation to find the lattice, rather than the step-by-step procedure that is necessary for local deformations.
Week 5 (July 18 - 24):
Monday, the 18th:
I realized that however effective the correlation is of the rotated and reflected images, and however sharp the thresholds are set, the clustering procedure for deciding good scores will still not be foolproof.
Now, in an ideal scenario, a well-matched pair of images would show a clearly large proportion of pixel-to-pixel similarity, and it would be trivial to divide the scores. But the correlation method at hand gives a range of scores depending on the image, from clearly separable to frustratingly concentrated. And while Week 2's rule of using k-means only when the standard deviation is high, and thresholding otherwise, works for the concentrated cases, some images still end up with scores that fit no group.
So we need a relaxed margin for those images where no group can be detected. This involves taking the lowest 'bad' score, or dropping the highest 'good' score, and checking if this gives a group. Fortunately, the existing function's decision tree is based on the number of good results, so I bifurcated the group prediction function. Now, in case we find a set of results that fits no group of its order, we can call the decision procedure on a set with one more or one less 'good' score.
This amounts to stretching the decision boundaries to compensate for the score clustering error. I ran the previously failed images again, and 10 of the 50 or so unclassified groups gave a correct 'guess' classification, which I suppose is reasonably satisfactory.
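The relaxed-margin retry, in skeleton form (a Python sketch -- `decide` stands in for the real decision tree, which is not reproduced here):

```python
def classify_with_margin(sorted_scores, n_good, decide):
    """Try the group decision with the clustered count of 'good'
    scores; if no group of that order fits, relax the count by one
    in either direction.

    `decide` maps a good-score count to a group name or None; it is
    a placeholder for the project's decision tree. Returns the group
    (or None) and a flag marking relaxed results as a 'guess'.
    """
    for k in (n_good, n_good + 1, n_good - 1):
        if 0 <= k <= len(sorted_scores):
            group = decide(k)
            if group is not None:
                return group, (k != n_good)
    return None, False
```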
Tuesday, the 19th:
I tabulated yesterday's results, and also started exploring various other clustering techniques I could use instead. I'm wondering if I should keep going along the relaxed margin route, and use fuzzy clustering instead.
The advantage of this would be that instead of doing a strict conditional check for groups, we could associate groups probabilistically with the patterns. So for incorrectly classified images, we can get an idea of where on the 17-group spectrum the correlation scores lie. If they yield the correct group with a high probability, we can use that result, allowing that margin in score finding. However, if the correct group is detected with a lower probability, we have to conclude that there is some other problem: either the image has been hand-classified wrong, a bad lattice or tile is used, or the color information of the image is not sharp enough to allow comparison. On the other hand, even for correctly classified images, noting a probabilistic range of classifications might yield some interesting conclusions about transitions and hierarchies among wallpaper groups in graphics.
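For reference, the membership computation at the heart of fuzzy c-means looks like this (a Python sketch; in a real pipeline the cluster centers would come from the clustering pass, here they are simply given):

```python
import numpy as np

def fuzzy_membership(scores, centers, m=2.0):
    """Fuzzy c-means style memberships of 1-D scores to cluster
    centers: u[i, k] = 1 / sum_j (d_ik / d_ij)^(2/(m-1)), where
    d_ik is the distance from score i to center k. Each row sums
    to 1, giving the 'probabilistic association' of a score with
    each cluster. m > 1 controls the fuzziness."""
    d = np.abs(np.asarray(scores, float)[:, None]
               - np.asarray(centers, float)[None, :])
    d = np.maximum(d, 1e-12)            # avoid division by zero
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)
```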
I decided to keep that in mind but work on it later, and turned my attention to James's code on texture irregularity correction. I wrote some functions that would adapt his extracted 'texton' into a median tile, for symmetry classification after straightening.
Wednesday, the 20th:
Back to clustering: I examined the results for images that were consistently failing, and adjusted the k-means and separation threshold a little. It's a little better, but there are still images where the decision is not correct. While I was at it, I looked through the glide reflection group image results, and realized that a lot of the scores were missing classification by a very small margin, so I relaxed the glide score threshold from (0.45, 0.55) to (0.4, 0.6)... fortunately, no false positives resulted from that.
Besides fuzzy clustering, I'm wondering if it would be a good idea to do some sort of learning-based classification -- that is, rather than dividing the scores, construct a training set of scores and associated (hand-classified) groups, and see if a decision tree can be constructed.
I just realized that in the old main function, one of the loops had, for some reason, an unnecessary nested one that made the program run n times on each file for a directory of size n. This was why the program had been running so slowly in the past -- it was iterating over the same files a great many times.
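The shape of the bug, reconstructed in Python (the actual code is MATLAB, and these function names are made up for illustration):

```python
# Buggy shape: an unnecessary nested loop re-runs every file once per
# file in the directory, so a directory of n files costs n^2 runs.
def run_all_buggy(files, process):
    runs = 0
    for _ in files:
        for f in files:          # the spurious inner loop
            process(f)
            runs += 1
    return runs

# Fixed shape: each file is processed exactly once.
def run_all_fixed(files, process):
    runs = 0
    for f in files:
        process(f)
        runs += 1
    return runs
```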
I ran all images overnight again after all tweaks, for what I hope is the final time for this stage.
Thursday, the 21st and Friday, the 22nd:
Tabulated the results from Wednesday. I e-mailed my mentor about the correlation score issues, and she suggested that I record the scores, to see if there's a connection between the group type and the distribution. I'm not sure if this is related to my learning idea, or if not, how exactly it could shed light on the shortcomings... but I decided to record it anyway, and work with it next week. She also suggested thinking of alternate ways of computing the correlation score to improve the relative distances, which I'll try to implement soon.
Meanwhile, I also searched online for images with possible global non-linear transformations to use for the later part of my work, and continued to work with the deformation straightening code, mainly fixing syntactic and technical bugs.
Saturday, the 23rd and Sunday, the 24th:
I mostly worked at straightening out the database with the results, and read up a little more on pattern recognition and neural networks, hoping to find something useful for the correlation problem. On the non-work side, I got to see the Pittsburgh Symphony Orchestra play the Lord of the Rings score, which was great, and caught a live rock show at the 'Quiet Storm Cafe', with some bands from Pittsburgh and New York City... it was some of the best live music I've heard in a while.
Week 6 (July 25 - 29):
Monday, the 25th:
I started looking at the individual score records, and realized that about half the failed classifications were simply a result of incorrect clustering due to ambiguous distributions. The others had a more basic problem with the symmetry score finding -- transformations with symmetries might show higher scores than those without and vice versa. So it might be necessary to tackle this problem first before the clustering one.
The present method just uses a sum of squared differences. I began by implementing chamfer matching, since I was dissatisfied with the previous edge image scores anyway; I felt that even if it didn't work out as a general method, it would help in the area of edge image symmetry detection. Unfortunately, a naive chamfer matching involves computing the distance from each pixel of one image to every pixel of the other, which is obviously extremely time inefficient. But I let it run on a couple of images with previously unsatisfactory edge symmetry scores anyway, just to see what the results would look like. While it was running, I thought about ways to make it faster.
One possibility would be to compare every 2nd pixel on both distance images, thus increasing the speed by a factor of 4 (2 for each loop). This might fail if the edges are extremely sparse, and hence, more sensitive to approximations. Another way could be to compare the distance of each pixel with only a small group of neighbors of the corresponding one on the other image. However, this would need a metric a little different from the chamfer.
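For the record, the standard way around the all-pairs cost is a distance transform: two linear sweeps precompute each pixel's distance to the nearest edge pixel, after which the chamfer score is a single pass. A Python sketch using the city-block metric for simplicity (`step` implements the every-nth-pixel subsampling idea above):

```python
def chamfer_dt(edges):
    """Two-pass city-block distance transform: d[i][j] = distance to
    the nearest edge pixel. Two linear sweeps replace the naive
    quadratic all-pairs comparison."""
    h, w = len(edges), len(edges[0])
    INF = h + w  # larger than any possible city-block distance here
    d = [[0 if edges[i][j] else INF for j in range(w)] for i in range(h)]
    for i in range(h):                     # forward sweep
        for j in range(w):
            if i > 0: d[i][j] = min(d[i][j], d[i - 1][j] + 1)
            if j > 0: d[i][j] = min(d[i][j], d[i][j - 1] + 1)
    for i in reversed(range(h)):           # backward sweep
        for j in reversed(range(w)):
            if i < h - 1: d[i][j] = min(d[i][j], d[i + 1][j] + 1)
            if j < w - 1: d[i][j] = min(d[i][j], d[i][j + 1] + 1)
    return d

def chamfer_score(edges_a, edges_b, step=1):
    """One-directional chamfer score: average distance from each edge
    pixel of A to the nearest edge pixel of B. `step` subsamples A's
    edge pixels -- the 'every 2nd pixel' speedup."""
    d = chamfer_dt(edges_b)
    pts = [d[i][j] for i in range(len(edges_a))
           for j in range(len(edges_a[0])) if edges_a[i][j]][::step]
    return sum(pts) / len(pts) if pts else 0.0
```

A symmetric version would average the score in both directions; the one-way form is kept here for brevity.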
Tuesday, the 26th:
My mentor suggested trying correlation and mutual information for the symmetry detection. The correlation did not give better results than SSD, because it is essentially a pixel-to-pixel matching too.
Wednesday, the 27th to Friday, the 29th:
The score distributions are worth looking through anyway, for the idea they give of an image and its relation to its group, so I organized them into a database, trying to find connections between the distributions and each image's color and symmetry properties.
I also looked through some of the failed lattices and their peakmaps. There were some peakmaps which looked intuitively regular enough to suggest the correct lattice by a connect-the-dots viewing. The algorithm used does not make use of the general connecting of points as much as an iterative try-and-match method. Playing around with a few peakmaps, I sketched out a connect-the-dots method that, in short, counts all the lines through three or more collinear points, draws a lattice from the two most frequently occurring lines, and comes up with the lengths of the vectors by counting the number of points coinciding with each segment length.
Week 7 (July 31 - August 4):
These five days were dedicated to SIGGRAPH, which was great. I might write a detailed summary of it later when I'm less tired (just got back!). For now, just going to say how overwhelming and eye-opening it all was. I loved how artists and scientists, designers and engineers were all together in the same place, talking about and showing their work, and making the fusion between art and technology seem so natural. And seeing how there are so many aspects of computer graphics research, with so much accomplished and so much more left to do was fascinating. I've only been back a few hours, and I already miss the aesthetic and informational inundation of the conference.
Week 8 (August 5 - 14):
I started implementing the lattice detection algorithm I had planned the week before SIGGRAPH, and realized that it is easier to visualize than put into code. It requires far too many calculations and checks -- line segment division, slope matching, counting of a number of sets. Even if I could get it implemented efficiently, the stack of function files in front of me indicated that the time complexity might not be the best.
So I tried finding a more straightforward approach to the problem, keeping the same general idea. The nearest neighboring peak approach is along the same philosophical lines, but is easier to write and is far more time efficient. My original plan's motivation was in maximizing the match between the optimal vectors and the peaks by counting through. This is almost equivalent to counting the vector pairs for each peak (i.e., for every peak, store the vectors to the two nearest unseen peaks), and finding the pairs with the maximum occurrences. It can be shown* that this works if half or more of the correct peaks are present in the map, and fewer than half of the peaks in the map are spurious.
This translates a little more elegantly into code than the previous plan! It also works pretty well in practice, since the requirements for a good peak map are a lot less stringent than those of Lin et al.'s Hough-based algorithm -- a few missing peaks do not affect the results as drastically. At the same time, it seems to be as good at ignoring wrong peaks.
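A bare-bones sketch of the voting idea in Python (the real code has to cope with noisy and missing peaks far more carefully; `canon` just identifies a vector with its negation so each direction is counted once):

```python
from collections import Counter

def canon(v):
    """Identify v with -v so each lattice direction is counted once."""
    return v if v > (-v[0], -v[1]) else (-v[0], -v[1])

def lattice_from_peaks(peaks):
    """Vote on lattice basis vectors: for each peak take the vector to
    its nearest neighbour, then to the nearest non-parallel neighbour,
    and return the most frequent (vector, vector) pair."""
    votes = Counter()
    for p in peaks:
        others = sorted((q for q in peaks if q != p),
                        key=lambda q: (q[0]-p[0])**2 + (q[1]-p[1])**2)
        v1 = canon((others[0][0] - p[0], others[0][1] - p[1]))
        for q in others[1:]:
            v2 = canon((q[0] - p[0], q[1] - p[1]))
            if v1[0]*v2[1] - v1[1]*v2[0] != 0:   # skip parallel vectors
                votes[(v1, v2)] += 1
                break
    return votes.most_common(1)[0][0]
```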
I met with Professor Liu to go over this, as well as the symmetry detection procedures I was trying.
*see final paper
Week 9 (August 15 - 21):
We (the five DMP interns in the lab) gave short presentations of our work at the weekly lab meeting on Tuesday. I think we were all a little nervous (I certainly was) about talking about what we did all summer to the professors and grad students. I felt like there'd be a flaw someone would find, or that it would appear I'd done too little work! But it was also nice to let everyone know what we were working on, and see what the others had been doing. A couple of people also had pointers to recent graphics work that could be of use, especially for the color/intensity problems.
I photographed the rugs from the book, but it was hard to get the lighting and resolution perfect. We decided that it would be preferable to scan them. Prof. Liu and I met with Dr. Eddy, who gave us some interesting information about rug analysis.
Oriental rugs adhere strictly to the rules specified by the regions that they are woven in. These rules most often involve the shapes of motifs, their arrangements, the borders of the rug, and to some extent, the colors of the yarn used and the type of weaving. As a result, a rug's geographic birthplace can be identified, if it is faithful enough to the regional patterns.
Dr. Eddy's goal is to write a program that will do this identification. That would involve identification and analysis of the motif, analysis and separation of the yarn colors, and so on. He's been trying to identify the unique colors in the rug, but so far, a histogram of the RGB values not only yields a range across the whole color cube, but also lacks any discernible peaks that could correspond to the image's colors. So we need to identify why this is happening, and what the correct approach to the problem could be.
Meanwhile, my lab computer started powering off on me, and logging me out of MATLAB, so I couldn't get much programming done. So I did a little organization of the code to make it more readable, and started trying to find a bound, in terms of correct peaks, for the new lattice detection algorithm to work. We'd also realized by now that failed lattices are due to three different problems rather than one -- a bad autocorrelation map, a bad peak map given a reasonable-looking autocorrelation (that is, too few correct peaks for the algorithm mentioned to work), and a bad lattice given a fair peak map. So I subclassified the 'failed lattice' images in the database into one of these three problems.
I found the SIGGRAPH '05 paper that was suggested at the meeting for better RGB to gray conversion: Color2Gray. It looks like a clever and useful algorithm, and the demonstrated results are promising. Guess I'll try using the source code in the program, in lieu of the inbuilt rgb2gray function.
I tried photographing the images again, but the resolution was just not working, so my mentor gave me access to the color scanner over the weekend. I spent most of Sunday evening scanning the rugs from the book, which took a while because I was trying to eliminate the shading and haze effects that come from scanning a book on a flatbed. I was pretty satisfied with the images at the end of it, so it seemed worth the effort -- the resolution was high, and the color information looked good.
Tried getting started on my final report as well.
Week 10 (August 22 - 28):
I figured out a lower bound and proof for the lattice-from-peaks algorithm. My computer was still a little flaky, and I was still logged out of MATLAB, so I occupied myself trimming and adjusting the rug images. I also started taking a few more real world texture photographs for the database.
I met with my mentor on Tuesday, and we discussed what I need to wrap up, it being my last week here. We decided that one task would be to report all the problems in the algorithm, and figure out the causes for each of them. She also asked me to take a few more texture photographs, and hand-classify the ones in the database that weren't, and of course, run the program on them (once I got my computer working again!)
We met with Bill Eddy again. Over the weekend, he wrote a thresholding convex hull program that might give a clearer indication of the dominant colors in the images. This may be a step towards pinpointing the exact yarn colors, but right now, the histogram only gives a fuzzy range for each color. I might be working with him remotely, over the next semester, on the project, which is an exciting idea.
The computer was fixed only on Friday, so that left me very little time to wind up the programming part. I started by downloading the Color2Gray code, but it turns out it can't really be used yet since it is so expensive -- it runs out of memory if the target gray image is to be more than 50 pixels wide! I don't know if it's feasible to optimize the algorithm, since I haven't read the paper that thoroughly yet. I hope it is, or the paper will be pretty useless unless you have a supercomputer at hand!
Week 11 (August 29 - September 3):
I flew back to Boston on Monday, but since I still had a couple of days before classes started, I got a little more work done.
On the flight, I realized something that should have occurred to me (or, for that matter, anyone who's tried to do lattice detection!) much earlier. An emphasis of this project is to make the symmetry group identification as human-independent and self-sufficiently dynamic as possible. One of the easiest stages of automatic verification is the lattice detection procedure. The old program implements something like one by comparing the displacement vector intersections with the autocorrelation. However, the most intuitive way that a lattice can be judged (by human or machine) is by comparing the regions of the image cut out by the displacement vectors. If all the regions are the same, the lattice is correct!
So I got to work writing code that would compare one of the central tiles with the four tiles laterally adjacent to it. (There's no reason to pick these tiles other than that it makes it less likely for the adjoining tiles to be out of the image's range.) If the comparison is good, it retains the current lattice. If it isn't, it looks for another one.
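The verification step, sketched in Python (the tolerance value and the fixed tile shape are simplifications of what the MATLAB code would need for real images):

```python
import numpy as np

def verify_lattice(img, origin, v1, v2, tile_shape, tol=10.0):
    """Verify a proposed lattice by comparing the tile at `origin`
    with its four laterally adjacent tiles (origin +/- v1, +/- v2):
    if every neighbour matches within `tol` mean squared difference,
    the lattice is accepted. `tol` is an illustrative value."""
    h, w = tile_shape

    def tile(r, c):
        return img[r:r + h, c:c + w].astype(float)

    base = tile(*origin)
    for dr, dc in (v1, v2, (-v1[0], -v1[1]), (-v2[0], -v2[1])):
        neighbour = tile(origin[0] + dr, origin[1] + dc)
        if neighbour.shape != base.shape:   # adjoining tile out of range
            return False
        if np.mean((base - neighbour) ** 2) > tol:
            return False
    return True
```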
I then integrated the old lattice finding function with mine using this verification as an indicator. The program now first runs the old code if the number of peaks <=150 (since it runs very slowly for larger peak sets -- complexity being the square of the number of peaks). It then verifies the lattice generated. If it is inferred to be incorrect -- or if the number of peaks was too large to begin with -- it goes to the new algorithm.
Now I wanted to make this similar to the old idea in that it takes progressively larger sets of peaks too. So I worked the verifier into the code so it runs a check after the lattice is generated for every set of peaks.
Besides this, I worked on the final paper, and the general organization and summary of the ideas and results.
I kind of miss CMU and the summer now.