Mathematicians beat pornographers!

For years, whenever I ran a web search for something involving LaTeX, I would throw the word “typesetting” into the search terms in order to screen out the p-o-r-n. I just checked, and this is no longer necessary: Even without safesearch, the first three pages of google hits on “latex” contain only one allusion to the material’s use in fetish wear — and only three references to the rubber material at all.

It is great that google now thinks I am more likely to care about quality typesetting than about rubber-clad women. But I wonder whether this is smart behavior on google’s part. It seems to me that a really smart search engine would realize that people searching for “latex” fall into three or four distinct camps (mathematicians, materials scientists, fetishists, and perhaps some group I’m not thinking of) and would offer me a few hits focused on each group. And that, in turn, made me wonder how I would design an algorithm to do such clustering. Any ideas?


23 thoughts on “Mathematicians beat pornographers!”

  1. 1. Do you get the same behavior after flushing Google cookies? I suspect the results are simply being tailored to you.

    2. I saw a talk by Jon Kleinberg many years ago about how to do clustering, using the first few eigenvectors of the Laplacian of the induced subgraph on the first 1000 or so Google hits. In general, the first few eigenvectors assign close values to tightly linked vertices in a graph. The idea is that Jaguar car sites point to other ones, jaguar big cat sites to others, etc.
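
    A rough sketch of that construction, in standard spectral-clustering notation that is assumed here rather than taken from the talk (which may have used a different matrix): let $A$ be the symmetric adjacency matrix of the link graph on the top $n$ hits, ignoring link direction, let $E$ be its edge set, and let $D$ be the diagonal matrix of vertex degrees.

    ```latex
    % Assumed notation: A = adjacency matrix, D = degree matrix, E = edge set,
    % n = number of hits, k = number of eigenvectors kept.
    \[
      L = D - A,
      \qquad
      x^{\mathsf T} L x \;=\; \sum_{\{i,j\} \in E} (x_i - x_j)^2 .
    \]
    % Order the eigenvalues 0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n,
    % with orthonormal eigenvectors v_1, \dots, v_n, and embed each page:
    \[
      i \;\longmapsto\; \bigl( v_2(i), \dots, v_k(i) \bigr) \in \mathbb{R}^{k-1},
      \qquad i = 1, \dots, n .
    \]
    ```

    Because an eigenvector with a small eigenvalue makes the sum of $(x_i - x_j)^2$ over links small, pages that link heavily to one another land near each other in this embedding, and any off-the-shelf clustering of the embedded points should then tend to separate the car sites from the big cat sites.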

  2. There definitely is some tailoring – my advisor was very proud one day when he discovered his website was the first hit when you just google his first name. However, when I tried it on my computer, he was still second. (I still get him below the other guy.) I suppose Google still hasn’t learned enough about me to realize I’m more likely to be searching for a philosopher of probability than a lover of Ayn Rand.

  3. I tried it on a clean browser, so unless google is tailoring to my IP, then David’s right. I get math typesetting hits for everything except the “video results” and the “sponsored links.”

  4. When I saw the title of this post on the RSS feed, I thought it was going to be another one of those job satisfaction surveys.

    Only slightly off topic: Do people here have any opinions about the best way to make graphics for latex? I used to just handcode the xy, but that’s getting to be a bit too much effort. I’ve recently discovered (i.e., yesterday) a program called Inkscape, which translates freehand drawings into vector graphics (which you then edit), and also does a nice job of outputting PSTricks code. Anyone got other recommendations?

  5. Forgot to add, there’s an extension of Inkscape — called textext — which allows you to insert latex objects into the pictures you make.

  6. A paper of Cheng, Kannan, Vempala, and Wang (“A divide-and-merge methodology for clustering”) describes a search engine implemented based on a spectral clustering algorithm.

    The engine, eigencluster, is available at , though I’m not sure if it is still being maintained or still in the process of being improved at this point.

  7. AJ: Masahico and I have been using xfig for years. Its file sizes are small, the splines are easy to use, and layers are great for creating knot crossings: arc of slope 1 at level 50, arc of slope -1 at level 60, small white filled circle with no boundary at level 55.

    When using xfig, you want to stay on the coarsest lattice for as long as possible.
    xfig exports to eps, ps, jpg, etc. It is unix/linux/mac os x based. To include graphics in latex use:
    in the preamble and
    something like:

    \caption{An arc of double points}\label{double}

    in the body. Depending on your system, the file name might have to be filename.eps, filename.pdf, or filename.jpg
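
    A minimal sketch of the kind of inclusion described above, assuming the graphicx package (the comment does not name one). The caption, the label, and the name "filename" come from the comment; the document class, placement option, and width are illustrative choices only.

    ```latex
    \documentclass{article}
    % Preamble: graphicx is an assumption, not necessarily the package originally used.
    \usepackage{graphicx}

    \begin{document}

    % Body: a figure exported from xfig. The file (filename.eps, filename.pdf, ...)
    % must exist next to this document for the example to compile.
    \begin{figure}[htb]
      \centering
      \includegraphics[width=0.6\textwidth]{filename}
      \caption{An arc of double points}\label{double}
    \end{figure}

    See Figure~\ref{double}.

    \end{document}
    ```

    Leaving the extension off lets the engine pick whichever supported format it finds, which is one way to cope with the system dependence mentioned above.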

    We started using this before MIME-compliant attachments were common, so we could include the fig files (ASCII) in the body of an email, then strip the header and open the file to edit the figure.

  8. “arc of slope 1 at level 50, arc of slope -1 at level 60, small white filled circle with no boundary at level 55.”

    A tip which I learned from Frank Sottile — instead of a white circle, use a white arc of thickness 3, following the same path as the top arc. This never looks worse and sometimes looks significantly better, as it automatically cuts the lower arc at exactly the right place.
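
    The same trick can be sketched in TikZ instead of xfig (a swapped-in tool, used here only because it can be written out as text). The coordinates and line widths are arbitrary; drawing order plays the role of xfig's levels.

    ```latex
    \documentclass[tikz]{standalone}
    \begin{document}
    \begin{tikzpicture}
      % Lower strand, drawn first so everything else sits on top of it.
      \draw[thick] (0,0) .. controls (1,0) and (1,1) .. (2,1);
      % White stroke along the upper strand's path, wider than the strand itself;
      % it erases the lower strand exactly where the two cross.
      \draw[white, line width=3pt] (0,1) .. controls (1,1) and (1,0) .. (2,0);
      % Upper strand, same path as the white stroke, drawn last.
      \draw[thick] (0,1) .. controls (1,1) and (1,0) .. (2,0);
    \end{tikzpicture}
    \end{document}
    ```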

    I also use xfig, combined with PSTricks to get myself the fonts I want. I’m not as happy with it as Scott Carter — it’s very good for figures made up of polygons and circles, but I don’t like it for splines or when I want to sketch something freehand without indicating a particular shape.

    I also wish I could define a point as the intersection of two existing shapes, and have it move when I moved the other shapes. (For example, draw two crossing circles and the line segment joining their points of intersection, then have the whole diagram adjust if I move a circle.) Having to do this sort of adjustment by hand slows down figure editing a lot. But I haven’t found anything I like better.

  9. I think “latex -rubber” works. The minus sign means you filter out anything with that term. It works… except for a link about a polka-dotted dress. One of the flaws in PageRank is that it does not pay attention to *who* is doing the searching. Doing so is impossible if you consider every use of the search engine as independent.

    Probably using cookies and other spyware you can define a bipartite graph of users and web pages. Two web pages are connected through a person if he clicks on both of them. Two people are correlated if they click on the same web site.
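
    One way to write this down, with notation that is purely an assumption for illustration: record the clicks of $m$ users on $n$ pages in a 0/1 matrix $B$.

    ```latex
    % Assumed notation: B is the m-by-n click matrix of the bipartite graph,
    % with rows indexed by users u, v and columns by pages p, q.
    \[
      B_{up} =
      \begin{cases}
        1 & \text{if user } u \text{ clicked on page } p, \\
        0 & \text{otherwise.}
      \end{cases}
    \]
    % Page-page and user-user similarity follow from products of B:
    \[
      (B^{\mathsf T} B)_{pq} = \#\{\text{users who clicked both } p \text{ and } q\},
      \qquad
      (B B^{\mathsf T})_{uv} = \#\{\text{pages clicked by both } u \text{ and } v\}.
    \]
    ```

    Large off-diagonal entries of $B^{\mathsf T} B$ then mark pages connected through many people, and large off-diagonal entries of $B B^{\mathsf T}$ mark correlated users.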

    Another idea is trying to define the PageRank poset instead of trying to well-order them. Or maybe in real life everything is well-ordered.

  10. AJ: I still just “code” my diagrams, but I’ve been very happy with tikz. All the diagrams I did for the TQFT via Planar Algebras posts were done in tikz with relatively minimal effort.

    Just like latex is “smart” and tries to format your code in a way that it decides looks the prettiest, tikz is also smart.

    If you just want to draw a curve through some points you can use some code like:

    \tikz \draw plot [smooth] coordinates {(0,0) (1,1) (2,0) (3,1) (2,1) (10:2cm)};

    The manual is well written and comprehensive, and if you want to place a point at the intersection of curves, e.g. two circles as David suggests, then this is easy and the point goes along for the ride as you adjust the circles.
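
    A minimal sketch of the two-circle case, assuming a reasonably recent TikZ that ships the intersections library; the path names and the point names p and q are made up for this example.

    ```latex
    \documentclass[tikz]{standalone}
    \usetikzlibrary{intersections}
    \begin{document}
    \begin{tikzpicture}
      % Two crossing circles; each path is named so that it can be intersected.
      \draw[name path=left circle]  (0,0) circle [radius=1.5];
      \draw[name path=right circle] (2,0) circle [radius=1.5];
      % Compute both intersection points and call them p and q.
      \path[name intersections={of=left circle and right circle, by={p,q}}];
      % Segment joining the two intersection points.
      \draw (p) -- (q);
    \end{tikzpicture}
    \end{document}
    ```

    Changing a center or a radius and recompiling moves p, q, and the segment with the circles, as long as the circles still cross.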

    I can show you some more examples when I get back to the office next week.

    Also, I just learned, quite pleasantly, that tikz is super compatible with beamer. So if you ever want to turn your figure into a dynamic slide presentation you can!
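
    A small sketch of what that can look like, using beamer's \only to reveal part of a picture on the second slide; the frame title and the picture itself are invented for illustration.

    ```latex
    \documentclass{beamer}
    \usepackage{tikz}
    \begin{document}
    \begin{frame}{Building a crossing}
      \begin{tikzpicture}
        % Shown on every slide of the frame: the lower strand.
        \draw[thick] (0,0) -- (2,2);
        % Shown from the second slide on: a white eraser stroke, then the upper strand.
        % Both strands span the same box, so the picture does not shift between slides.
        \only<2->{
          \draw[white, line width=3pt] (0,2) -- (2,0);
          \draw[thick] (0,2) -- (2,0);
        }
      \end{tikzpicture}
    \end{frame}
    \end{document}
    ```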

  11. Re 14: The thick white line below is very clever!
    I have also used illustrator for more complicated things. But
    (a) it is expensive; (b) its file sizes get out of hand quickly.
    The arXiv does not like large files.
    My draft of the sphere eversion book is well over 30Megs. But it has LOTS of pictures.

    Another problem with xfig is the lack of transparency layers.

  12. I didn’t say this outright, but I’m actually very pleased with Inkscape. It’s open source, and situated about midway between xfig and illustrator in terms of sophistication. I’m very pleased with its ability to turn my freehand drawings into PSTricked latex code.
