├── intro_web_data
    ├── arts
    ├── classify.py
    ├── delicious_import.py
    ├── distance_demo.py
    ├── links.csv
    ├── nytimes_pull.py
    ├── rec.py
    ├── sports
    ├── stopwords.txt
    └── tag_clustering.py
└── solving_problems
    ├── access_log.txt
    ├── bloom_filter.py
    ├── decision_tree_regression.py
    ├── descriptions.csv
    ├── flat.txt
    ├── kmeans_descriptions.py
    ├── liked_decision_tree.py
    ├── map_reduce.py
    ├── path_distribution.txt
    ├── pca.py
    ├── scripts
        ├── bar_chart.py
        ├── histogram.py
        └── ninety_five_percent.py
    ├── simhashes.py
    ├── thingiverse_all_names.csv
    ├── thingiverse_liked_objects.csv
    ├── thingiverse_liked_objects_1k.csv
    ├── thingiverse_tree.dot
    └── thingiverse_tree.png


/intro_web_data/arts:
--------------------------------------------------------------------------------
1 | LONDON -- An opera about Anna Nicole Smith --the American sex symbol, Playboy Playmate, hapless model, laughable actress and fortune-hunting wife of a billionaire 62 years her senior? Commissioned by, no less, the Royal Opera at Covent Garden? When the plans were announced it sounded like a dubious idea, a tawdry way for a major opera house to look
2 | THIS is the last weekend to see the New York production of “The Merchant of Venice” starring an Academy Award winner in one of Shakespeare ’s greatest roles, Shylock the moneylender. That is, until next weekend. Al Pacino (who won the best-actor Oscar for “Scent of a Woman”) wraps up his four-month run as Shylock on
3 | Theater Approximate running times are in parentheses. Theaters are in Manhattan unless otherwise noted. Full reviews of current shows, additional listings, showtimes and tickets: nytimes.com/theater . Previews and Openings ‘The Book of Mormon’ Previews start on Thursday. Opens on March 24. The “South Park” creators Matt
4 | It’s comedy night at the asylum, folks. And have we got some high-voltage vaudeville for you, the kind that curls your hair and turns your knees to rubber. So here he is, all the way from St.
Petersburg, Russia, the man who put the madcap in madness. Put your hands together for the stand-up stylings of Aksentii Poprishchin. No such words of 5 | Only a performer of monumental presence can withstand the theatrical typhoon that is Mandy Patinkin . So hats off to the frail-looking, child-size marionette who walks away with “Compulsion,” the straight-line bio-drama by Rinne Groff, starring Mr. Patinkin at gale force. Designed by Matt Acheson, this charismatic puppet — with 6 | Strip away the uninspired mythology, and “I Am Number Four” is just your average high school movie with below-average drama. Fielding familiar classroom stereotypes — the bully, the science geek, the strutting alien female in the skintight cat suit — this turgid schedule filler is only marginally more fun than a week’s 7 | “Putty Hill,” Matt Porterfield ’s moody, elliptical fusion of fiction and documentary, slips back and forth between the forms with a stealth that dissolves one into the other. The mostly nonprofessional actors in the film, set in a working-class neighborhood on the outskirts of Baltimore, play versions of themselves in a fictional 8 | Icíar Bollaín’s bluntly political film “Even the Rain” makes pertinent, if heavy-handed, comparisons between European imperialism five centuries ago and modern globalization. In particular it portrays high-end filming on location in poor countries as an offshoot of colonial exploitation. The movie is set in and 9 | “We Are What We Are,” Jorge Michel Grau’s macabre fable of urban survival, follows the disintegration of a pod of people eaters when its diseased patriarch expires in a shopping mall. Swiftly removed by a wordless cleanup crew, the man’s remains are found to contain a single undigested finger. “It’s shocking how 10 | The programmers for Film Comment Selects possess the refined tastes of practiced cine-mixologists, along with a yen for the outré. 
For the 11th edition of this two-week annual festival, which starts on Friday at the Walter Reade Theater, they have exhumed the old and rounded up the new, unearthing treasures and curiosities to put next to 11 | Liam Neeson ’s latter-day renaissance as an unlikely action star should give hope to performers and viewers of a certain age (i.e., over 40) everywhere. While that irrepressible exhibitionist Helen Mirren , born in 1945, continues to inspire legions of AARP members, one discarded garment at a time, Mr. Neeson, a comparative pup born in 1952, 12 | Movies Ratings and running times are in parentheses; foreign films have English subtitles. Full reviews of all current releases, movie trailers, showtimes and tickets: nytimes.com/movies . ★ ‘Another Year’ (PG-13, 2:09) An autumnal gem from Mike Leigh , by turns sweet and abrasive, gentle and sad, about the unequal distribution of 13 | It’s fine to employ a plot device that’s been used repeatedly. But you run into trouble when you use a familiar plot and do only the familiar with it. “Immigration Tango,” a pale romantic comedy, has this problem. An immigrant couple (he’s from Colombia, she’s from Russia) in Florida strike a deal with their best 14 | One of the most urgent and certainly among the most beautifully shot documentaries to hit the big screen in recent memory, “The Last Lions” isn’t just another cute and fuzzy encounter session with a different species. It’s a pulse-quickening, tear-duct milking and outrageously dramatized story about the threats — 15 | Essentially a two-person play liberally sprinkled with gleaming, groovy, graphic sex, “Now & Later” exudes an amiably accessible vibe that softens the edges of its freewheeling explicitness. 
Lest we enjoy all this flesh without an ennobling context, the movie kicks off with Wilhelm Reich’s assertion of the link between sexual 16 | As a group they give a new and truer meaning to the phrase “independent film.” In a country where all movies must obtain official approval to be exhibited commercially, the five Chinese directors whose work will be featured beginning on Friday in the Museum of Modern Art’s Documentary Fortnight are forced to operate in a peculiar 17 | The something wicked that comes creeping like night in “Vanishing on 7th Street,” turning down the sun and seemingly sucking people right out of their homes, offices, cars and clothes, arrives without warning. One minute moviegoers are yukking it up at a multiplex in this generally nifty little horror flick, and the next minute 18 | “Loveless” is an aimless film about an aimless fellow, but it’s not without its charms. It may be without a point, but hey, you can’t have everything in a no-budget film like this. Andrew (Andrew Von Urtz) is a nearing-middle-age New Yorker who’s still behaving like a self- and sex-centered 25-year-old, trolling for 19 | It’s too bad that Paul Levesque is so, well, large. Mr. Levesque — a professional wrestler whose ring name is Triple H — is a perfectly tolerable actor, as he shows in “The Chaperone,” a lightweight comedy aimed, presumably, at tweeners and fans of World Wrestling Entertainment, whose film division generated this 20 | Around Town Museums and Sites American Museum of Natural History (Saturday and Wednesday) “Saluting Our Jazz Elders,” an afternoon of music and discussions on Saturday in celebration of Black History Month, will include performances by the percussionist Sekou Alaje (12:30 p.m.); the New Amsterdam Music Association (1:15 p.m.); the 21 | ‘CIRCUS INCOGNITUS’ Jamie Adkins, an old-style vaudeville performer, won’t mind if you throw things at him during his show. He wants you to throw. He invites you to throw. He’ll even provide the things. 
That Mr. Adkins behaves this way in front of children at the New Victory Theater attests to his bravery. The elementary 22 | Poised and whispery, Vanessa Paradis played her New York City debut as a headliner at Town Hall on Wednesday night. In France, Ms. Paradis has been a pop star since she was 14, when her single “Joe le Taxi” was a No. 1 hit in 1987, and where she won the Victoires de la Musique award, the equivalent of a Grammy , for album of the year 23 | LONDON — The English National Opera introduced the German director Nikolaus Lehnhoff’s production of Wagner’s “Parsifal” in 1999, and since then this influential modern staging, which presents the Knights of the Grail as a spiritually decaying brotherhood in a bleakly gray, postapocalyptic and timeless setting, has 24 | Jazz Full reviews of recent jazz concerts: nytimes.com/music . Uri Caine, Theo Bleckmann, Todd Sickafoose, Jenny Scheinman (Friday) This latest show in the weekly Spontaneous Constructions series, which aims to foster new collaborations, features Mr. Caine, a keyboardist of spectacularly diverse tastes; Mr. Bleckmann, a vocalist of ethereal 25 | Pop Prices may not include ticketing service charges. Full reviews of recent concerts: nytimes.com/music . Trey Anastasio (Tuesday) Although Phish , the jam band that made him a star to the noodle-dancing set, has since regrouped, Mr. Anastasio has yet to abandon his solo career. Last summer he released “Time Becomes Elastic” (Rubber 26 | The Tune-In festival at the Park Avenue Armory promises to explore musical connections between past and present, as well as a few philosophical notions, like whether music has the power to express anything. (Stravinsky said that it does not; others have disagreed.) Most of the series, which runs through Sunday, was assembled by the enterprising 27 | Classical Full reviews of recent music performances: nytimes.com/music . 
Opera ★ ‘Armida’ (Friday and Wednesday) This infrequently heard 1817 Rossini opera finally made it to the Met last spring as a vehicle for Renée Fleming in a handsome and fanciful, if rather safe, production by Mary Zimmerman . “Armida” is 28 | Len Lesser, a character actor for more than half a century whose hawklike profile and Noo Yawk accent finally gained him popular recognition when he played Jerry Seinfeld ’s annoying Uncle Leo on “Seinfeld,” died on Wednesday in Burbank Calif. He was 88. The cause was pneumonia, said his son, David, adding that his father had been 29 | There’s a holdup in the Bronx, Brooklyn’s broken out in fights. There’s a traffic jam in Harlem That’s backed up to Jackson Heights. There’s a scout troop short a child, Khrushchev’s due at Idlewild. Car 54, where are you? Ask almost anyone over 50, and the song pours buoyantly forth, evoking one of 30 | Although a two-hour “Hollywood week” episode of Fox’s “American Idol” topped the ratings on Wednesday, CBS’s new series “Criminal Minds: Suspect Behavior,” with Janeane Garofalo and Forest Whitaker , delivered a strong debut at 10 p.m. According to Nielsen’s estimates 12.9 million viewers tuned 31 | Kate Werble Gallery 83 Vandam Street SoHo Through March 12 Flokati rugs, those fluffy white coverings traditionally handmade in the Pindus Mountains in Europe and prized by contemporary designers, become wild-and-woolly wall reliefs in Anna Betbeze’s first New York solo. Ms. Betbeze dyes, scorches, shreds, shaves and otherwise attacks these 32 | Winkleman Gallery 621 West 27th Street Chelsea Through March 12 The three short, related videos that make up Janet Biggs’s debut show at Winkleman were filmed on glacial islands between the top of Norway and the North Pole. 
Playing on separate screens and in overlapping sequence, the pieces can be viewed in any order, though a gallery news 33 | Meredith Ward Fine Art 44 East 74th Street Manhattan Through March 12 Working in oil on small pieces of canvas board near the waters and harbors of Manhattan, John Marin (1870-1953) was possibly the first American artist to make abstract paintings. There are other candidates — among them Marsden Hartley and Georgia O’Keeffe — but 34 | Rembrandt ’s jowly, battered face glows like a night light in the great late self-portrait from 1658 at the Frick Collection . And it glows more brightly than ever now that layers of old varnish have been cleaned away. Colors — the gold of the artist’s shirt, the wine-red of his Middle Easternish sash, the pink of the chafe mark 35 | BRIDGEPORT, Conn. — “This was the most contaminated room,” Kathleen Maher said, pointing to powdery debris, paint flakes and glass shards on shelves and carpeting in a dimly lighted ground-floor gallery at her work space. She is curator and executive director at the Barnum Museum here, where last June a tornado struck its 1890s 36 | Art Museums and galleries are in Manhattan unless otherwise noted. Full reviews of recent art shows: nytimes.com/art . Museums American Folk Art Museum : ‘Eugene Von Bruenchenhein: Freelance Artist — Poet and Sculptor — Inovator — Arrow maker and Plant man — Bone artifacts constructor — Photographer and Architect 37 | Perhaps because her work so frequently appears in exhibitions, art fairs and auctions, it seems as though Cindy Sherman’s photographs are often with us. Think of images of her as a clown, a Renaissance Madonna, a sex kitten or even a half-pig, half-human creature. But in the United States it has been nearly 14 years since the public has had a 38 | Steven Harvey Fine Art Projects 24 East 73rd Street Manhattan Through Feb. 
28 The companionship and inspiration that artists gain from other artists and their work is pinpointed in this sweet and unusual show. Its main focus is the friendship between Gandy Brodie (1925-1975) and Bob Thompson (1937-1966), who met in Provincetown, Mass., in the 39 | Many human beings evidently share with the magpie a gene causing an irrational attraction to bright and shiny objects. If you suffer from this disorder, you will love “Cloisonné: Chinese Enamels From the Yuan, Ming and Qing Dynasties,” a ravishing exhibition at the Bard Graduate Center. Displaying more than 160 items ranging from 40 | Leo Koenig Inc. Projekte 541 West 23rd Street Chelsea Through March 19 Vincent Szarek’s sleek, four-piece exhibition offers a poetic meditation on modern decadence. The first item, on the floor, is a long, narrow, geometric solid painted in glossy, metal-flake gold. It looks like a parody of Minimalist sculpture, but it is also readily 41 | The New Museum has become a busy place this year, and it is not yet even March. In January it opened a popular tribute to the market-hardy paintings of George Condo. Now it is offering a startlingly excellent resurrection of the prescient Post-Minimalist renegade Lynda Benglis and her gaudy, multidexterous and often gender-bending segues among 42 | Dance Full reviews of recent performances: nytimes.com/dance . Aspen Santa Fe Ballet (Tuesday through Thursday) This handsome company returns with a trio of contemporary works: Jorma Elo’s “Red Sweet,” Jiri Kylian’s “Stamping Ground” and the East Coast premiere of “Uneven,” by the Spanish 43 | Walter Dundervill is just fine with letting his imagination run wild, and that’s evident in “Aesthetic Destiny 1: Candy Mountain,” performed at Dance Theater Workshop on Wednesday night, where the terrain of the stage is decorated with colorful polygons of varying sizes. 
Some are propped up like jagged mountains with strangely 44 | New York’s flamencophiles are deprived of their annual Flamenco Festival this year, thanks to financial cutbacks in Spain. (The hope is to continue it biennially instead.) In its place, however, is a weeklong season of the filmmaker Carlos Saura ’s “Flamenco Hoy” (“Flamenco Today”) at City Center. Tuesday’s 45 | The British travel writer and novelist Bruce Chatwin (1940-89) had blond hair, flinty blue eyes and delicately firm features — he looked like a bookish member of the Police, Sting’s band, circa 1983 — and the kind of narcissism that can be a byproduct of talent mixed with charisma. Both men and women were drawn to him, and he to 46 | JACKSON, Miss. THERE is “The Help,” and then there is the help. And she is not happy. Ablene Cooper, a 60-year-old woman who has long worked as a maid here, has filed a lawsuit against Kathryn Stockett, the author of the best-selling novel “The Help,” about black maids working for white families in Jackson in the 1960s. In 47 | Geoffrey Rush stars in the production at the Brooklyn Academy of Music. 48 | Geoffrey Rush stars in the production at the Brooklyn Academy of Music. 49 | Geoffrey Rush stars in the production at the Brooklyn Academy of Music. 50 | Geoffrey Rush stars in the production at the Brooklyn Academy of Music. 51 | -------------------------------------------------------------------------------- /intro_web_data/classify.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | classify.py 5 | 6 | Created by Hilary Mason on 2011-02-17. 7 | Copyright (c) 2011 Hilary Mason. All rights reserved. 
"""

import re, string

from nltk import FreqDist
from nltk.tokenize import word_tokenize
from nltk.stem.porter import PorterStemmer

class NaiveBayesClassifier(object):

    def __init__(self):
        self.feature_count = {}   # feature -> {category: number of training docs containing it}
        self.category_count = {}  # category -> number of training docs

    def probability(self, item, category):
        """
        probability: Pr(item | category) * Pr(category), a score proportional
        to the probability that an item is in a category
        """
        category_prob = self.get_category_count(category) / sum(self.category_count.values())
        return self.document_probability(item, category) * category_prob

    def document_probability(self, item, category):
        features = self.get_features(item)

        p = 1
        for feature in features:
            # compute each feature's weighted probability once, print it, then multiply it in
            w_prob = self.weighted_prob(feature, category)
            print "%s - %s - %s" % (feature, category, w_prob)
            p *= w_prob

        return p

    def train_from_data(self, data):
        # data maps each category label to a list of documents
        for category, documents in data.items():
            for doc in documents:
                self.train(doc, category)

        # print self.feature_count


    # def get_features(self, document):
    #     all_words = word_tokenize(document)
    #     all_words_freq = FreqDist(all_words)
    #
    #     # print sorted(all_words_freq.items(), key=lambda(w,c):(-c, w))
    #     return all_words_freq

    def get_features(self, document):
        document = re.sub('[%s]' % re.escape(string.punctuation), '', document) # removes punctuation
        document = document.lower() # make everything lowercase
        all_words = [w for w in word_tokenize(document) if 3 < len(w) < 16] # keep words of 4 to 15 characters
        p = PorterStemmer()
        all_words = [p.stem(w) for w in all_words] # reduce each word to its stem
        all_words_freq = FreqDist(all_words)

        # print sorted(all_words_freq.items(), key=lambda(w,c):(-c, w))
        return all_words_freq

    def increment_feature(self, feature, category):
        self.feature_count.setdefault(feature,{})
        self.feature_count[feature].setdefault(category, 0)
        self.feature_count[feature][category] += 1

    def increment_cat(self, category):
        self.category_count.setdefault(category, 0)
        self.category_count[category] += 1

    def get_feature_count(self, feature, category):
        if feature in self.feature_count and category in self.feature_count[feature]:
            return float(self.feature_count[feature][category])
        else:
            return 0.0

    def get_category_count(self, category):
        if category in self.category_count:
            return float(self.category_count[category])
        else:
            return 0.0

    def feature_prob(self, f, category): # Pr(A|B)
        if self.get_category_count(category) == 0:
            return 0.0

        return (self.get_feature_count(f, category) / self.get_category_count(category))

    def weighted_prob(self, f, category, weight=1.0, ap=0.5):
        basic_prob = self.feature_prob(f, category)

        # total count of this feature across all categories; the loop variable is
        # named c so that it does not overwrite the category argument in Python 2
        totals = sum([self.get_feature_count(f, c) for c in self.category_count.keys()])

        # blend the assumed probability ap with the observed one, weighted by the evidence seen
        w_prob = ((weight*ap) + (totals * basic_prob)) / (weight + totals)
        return w_prob

    def train(self, item, category):
        features = self.get_features(item)

        for f in features:
            self.increment_feature(f, category)

        self.increment_cat(category)

if __name__ == '__main__':
    labels = ['arts', 'sports'] # these are the categories we want
    data = {}
    for label in labels:
        # each label is also the name of a data file with one document per line
        f = open(label, 'r')
        data[label] = f.readlines()
        # print len(data[label])
        f.close()

    nb = NaiveBayesClassifier()
    nb.train_from_data(data)
    print nb.probability("Early Friday afternoon, the lead negotiators for the N.B.A. and the players union will hold a bargaining session in Beverly Hills — the latest attempt to break a 12-month stalemate on a new labor deal.", 'arts')
    print nb.probability("Early Friday afternoon, the lead negotiators for the N.B.A.
and the players union will hold a bargaining session in Beverly Hills — the latest attempt to break a 12-month stalemate on a new labor deal.", 'sports')


--------------------------------------------------------------------------------
/intro_web_data/delicious_import.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# encoding: utf-8
"""
delicious_import.py

Created by Hilary Mason on 2010-11-28.
Copyright (c) 2010 Hilary Mason. All rights reserved.
"""

import sys
import urllib
import csv
from xml.dom import minidom


class delicious_import(object):
    def __init__(self, username, password=''):
        # API URL: https://user:passwd@api.del.icio.us/v1/posts/all
        url = "https://%s:%s@api.del.icio.us/v1/posts/all" % (username, password)
        h = urllib.urlopen(url)
        content = h.read()
        h.close()

        x = minidom.parseString(content)

        data = []

        # sample post:
        post_list = x.getElementsByTagName('post')
        for post_index, post in enumerate(post_list):
            url = post.getAttribute('href')
            desc = post.getAttribute('description')
            # tags arrive space-separated; store them comma-separated
            tags = ",".join(post.getAttribute('tag').split())
            timestamp = post.getAttribute('time')

            data.append([url.encode("utf-8"), tags.encode("utf-8")])

        # write one "url,tags" row per bookmark and close the file so the rows are flushed
        f = open("links.csv", 'wb')
        writer = csv.writer(f)
        for entry in data:
            writer.writerow(entry)
        f.close()


if __name__ == '__main__':
    try:
        (username, password) = sys.argv[1:]
    except ValueError:
        print "Usage: python delicious_import.py username password"
        sys.exit(1) # stop here; username and password are undefined

    d = delicious_import(username, password)


--------------------------------------------------------------------------------
/intro_web_data/distance_demo.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# encoding: utf-8
"""
distance_demo.py

Created by Hilary Mason on 2011-02-18.
Copyright (c) 2011 Hilary Mason. All rights reserved.
"""


from hcluster import *

class TagClustering(object):

    def __init__(self):
        v1 = [0,0,0,1]
        v2 = [0,1,1,1]

        print euclidean(v1, v2) # straight-line (L2) distance
        print cityblock(v1, v2) # Manhattan (L1) distance
        print jaccard(v1, v2)   # Jaccard dissimilarity


if __name__ == '__main__':
    t = TagClustering()


--------------------------------------------------------------------------------
/intro_web_data/nytimes_pull.py:
--------------------------------------------------------------------------------
import urllib
import json

def main(api_key, category, label):

    content = []
    for i in range(0,5):
        # print "http://api.nytimes.com/svc/search/v2/articlesearch.json?fq=news_desk:('%s')&api-key=%s&page=%s" % (category, api_key, i)
        h = urllib.urlopen("http://api.nytimes.com/svc/search/v2/articlesearch.json?fq=news_desk:(\"%s\")&api-key=%s&page=%s" % (category, api_key, i))
        print h
        raw = h.read() # keep the raw body so it can be reported if parsing fails
        h.close()
        try:
            result = json.loads(raw)
            content.append(result)
        except ValueError:
            print "Malformed JSON: " + raw
            continue # in the rare cases that JSON refuses to parse

    f = open(label, 'w')
    for line in content:
        try:
            f.write('%s\n' % line)
        except UnicodeEncodeError:
            pass

    f.close()

if __name__ == '__main__':
    main("f7b4a1749764aec0364b215c354e3a0f:18:25759498", "Arts","arts")
    main("f7b4a1749764aec0364b215c354e3a0f:18:25759498", "Sports","sports")


--------------------------------------------------------------------------------
/intro_web_data/rec.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# encoding: utf-8
"""
rec.py

Created by Hilary Mason on 2011-02-18.
Copyright (c) 2011 Hilary Mason. All rights reserved.
"""

import csv

# from Pycluster import *
from hcluster import *

class TagClustering(object):

    def __init__(self):
        tag_data = self.load_link_data()
        # print tag_data
        all_tags = []
        all_urls = []
        for url,tags in tag_data.items():
            all_urls.append(url)
            all_tags.extend(tags)

        all_tags = list(set(all_tags)) # list of all tags in the space

        numerical_data = {} # create vectors for each item
        for url,tags in tag_data.items():
            v = []
            for t in all_tags:
                if t in tags:
                    v.append(1)
                else:
                    v.append(0)
            numerical_data[url] = v

        # rank every URL by its distance from the one we want recommendations for
        recommend_url = 'http://www.qwantz.com/index.php'
        results = {}
        for url,vector in numerical_data.items():
            d = euclidean(numerical_data[recommend_url], vector)
            results[url] = d

        # closest first; ties are broken by URL
        print sorted(results.items(), key=lambda item: (item[1], item[0]))


    def load_link_data(self,filename="links.csv"):
        data = {}

        # each row is "url,tags", with the tags comma-separated inside one field
        f = open(filename, 'r')
        r = csv.reader(f)
        for row in r:
            data[row[0]] = row[1].split(',')
        f.close()

        return data


if __name__ == '__main__':
    t = TagClustering()


--------------------------------------------------------------------------------
/intro_web_data/sports:
--------------------------------------------------------------------------------
1 | Harvey Almorn Updyke Jr., 62, of Dadeville, Ala., was arrested on a charge of criminal mischief in connection with the poisoning of the Toomer’s Corner oak trees at Auburn. On Jan. 27, a man saying he was “Al from Dadeville” phoned a radio show, claiming he poured herbicide around the 130-year-old oaks that are a scene of
2 | Mike Repole has had a pretty good winter. St. John’s, his alma mater, is inching closer to a berth in the N.C.A.A. men’s basketball tournament for the first time since 2002. His racehorse Uncle Mo is the early favorite to win the Kentucky Derby .
His beloved Mets have started spring training. Repole, after all, is the latest in what has 3 | Derrick Rose scored a career-high 42 points and the host Chicago Bulls headed into the All-Star break with a 109-99 victory over the San Antonio Spurs . The Spurs have the N.B.A. ’s best record, 46-10; the Bulls, who won 41 games last season, are 38-16. Sports Briefing | Basketball 4 | DAYTONA BEACH, Fla. — The 10th anniversary of the fatal crash of Dale Earnhardt at Daytona International Speedway will pass Friday without an official tribute as a lower-level Camping World Series truck race is held at the track. But Earnhardt, a seven-time Nascar Cup champion, will be honored Sunday during the season-opening Daytona 500. On 5 | Joel Northrup refused to compete against a girl at the Iowa state tournament in Des Moines, relinquishing a chance to become a champion because he said wrestling a girl would conflict with his religious beliefs. Northrup, a home-schooled sophomore who was 35-4 wrestling for Linn-Mar High School, defaulted his first-round match in the 112-pound 6 | Tina Maze became the first Slovenian to win an Alpine skiing world championship, riding her advantage from the first run to the gold medal in the women’s giant slalom in Garmisch-Partenkirchen, Germany. She used a controlled second run to finish in 2 minutes 20.54 seconds and defeat Federica Brignone of Italy by 0.09. Sports Briefing | Skiing 7 | DAYTONA BEACH, Fla. — The Danica Patrick Nascar experiment enters its second season on Saturday at Daytona International Speedway, with expectations tempered but hopes raised. Patrick, 28, the only woman to win a race in the IndyCar Series, will compete in Nascar’s Drive4copd 300, a lower-level Nationwide Series race that runs the day 8 | DAYTONA BEACH, Fla. — Kurt Busch declared himself the favorite for Sunday’s Daytona 500, and it was hard to argue after he won the first of two 150-mile qualifying races Thursday at Daytona International Speedway. 
Busch is 2 for 2 at Daytona this month, having captured the exhibition Budweiser Shootout last Saturday. So far, no one has 9 | In an attempt to jump-start negotiations that stalled a week ago, representatives for N.F.L. owners and the league’s players planned to engage in seven consecutive days of talks with a federal mediator beginning Friday. The collective bargaining agreement expires in less than two weeks, and the decision to attempt to intensify negotiations 10 | Dustin Johnson wound up with another bizarre penalty Thursday when his caddie thought his tee time was 40 minutes later than it was, and he raced to the first tee at the Northern Trust Open in Los Angeles to avoid disqualification. Johnson was given a two-shot penalty for not being on the tee box at his starting time. Players then have five minutes 11 | The remaining games are starting to dwindle, and teams on the playoff bubble, like the Rangers and the Los Angeles Kings , are playing increasingly desperate hockey. That is what happened at Madison Square Garden on Thursday night, in a breathless cliffhanger not decided until Erik Christensen and Mats Zuccarello scored in the shootout and Henrik 12 | For some American hockey players at the highest level, memories of childhood are filled with idyllic days on frozen ponds and outdoor rinks. But for a growing number, childhood memories are framed by palm trees, warm weather and rooting for N.H.L. teams that many Northerners disdain as a failed Sun Belt experiment. Those memories reflect the 13 | SOUTH BEND, Ind. — Every Sunday after church, on the Hansbroughs’ backyard basketball court in Poplar Bluff, Mo., three brothers would play the age-old game of 21. The scene conjures up both Rockwell and Darwin — little Ben Hansbrough, the youngest, learned to survive despite a weekly diet of blocked shots and sharp elbows. Many 14 | Lynetta Kizer scored 17 points, and No. 16 Maryland beat No. 
7 Duke , 69-47, Thursday night to drop the visiting Blue Devils into a three-way tie for first place in the Atlantic Coast Conference. The Terrapins (21-5, 7-4) let a 12-point lead dwindle to 39-38 before pulling away to a victory that enabled them to avoid their first three-game losing 15 | Peter Roby grew up playing ball with Tom Thibodeau in New Britain, Conn., and later coached with him at Harvard . His friend’s success at basketball’s highest level is no surprise — Thibodeau, Roby recalled, was always passionate about learning the game and intrigued by the challenge of teaching it. The “defensive 16 | When the Naismith Memorial Basketball Hall of Fame announces its finalists for the class of 2011 on Friday, one name will be conspicuous in its absence: that of Reggie Miller , the former Indiana Pacers sharpshooter, who is in his first year of eligibility. Miller, 45, who retired in 2005 and will be in Los Angeles this weekend as an analyst for 17 | Before they can celebrate Derrick Rose’s ascendance, Kevin Durant’s dominance or Blake Griffin’s hang time , the N.B.A. ’s brightest minds and brightest players will gather in a hotel conference room and beg one another not to ruin it all. For the next three days in downtown Los Angeles, the N.B.A. will do what it does best 18 | PORT ST. LUCIE, Fla. — While their players stretched and exercised in advance of their first official workout of the 2011 season, the owners of the Mets stood nearby on an artificial turf field and discussed the troubling issues that could jeopardize their ownership of the team. Fred Wilpon , the principal owner of the team, said Thursday 19 | PORT ST. LUCIE, Fla. — Johan Santana has been throwing a baseball for almost two weeks, but the next time he throws a pitch in a major league game could be more than four months from now, perhaps sometime close to the All-Star break. That has been the time frame the Mets expected all along for Santana, a two-time Cy Young Award winner. 
But in 20 | PORT ST. LUCIE, Fla. Half a dozen times during a spirited news conference that lasted about 20 minutes, Fred Wilpon took to the offense with his new favorite V word, now that his role as a victim in Bernard L. Madoff case is under grave legal challenge. In what will most likely be a costly struggle for the survival of his ownership of the Mets , he 21 | TAMPA, Fla. — Freddy Garcia dipped into his memory bank the other morning, his mind drifting to a Seattle special of an afternoon, cool and overcast, during the 2001 playoffs, when he was the 25-year-old ace of the Mariners . Garcia recalled how Bartolo Colon pumped fastball after fastball past his Seattle teammates to secure the division 22 | The Detroit Tigers ’ Miguel Cabrera was arrested on charges of drunken driving and resisting an officer in Fort Pierce, Fla. He has had drinking-related problems, including a 2009 incident in which he fought with his wife after drinking heavily the night before his team lost the A.L. Central title to Minnesota. ¶Catcher Yorvit Torrealba 23 | Joe Frazier, the manager of the Mets in the turbulent period between the tenures of Yogi Berra and Joe Torre , died Tuesday in Broken Arrow, Okla. He was 88 and a longtime Broken Arrow resident. His death was confirmed by the Christian-Gavlik Funeral Home in Broken Arrow. Frazier, who spent almost a half-century in organized baseball, primarily as 24 | PHOENIX — It was Don Mattingly ’s opening news conference at his first spring training as the Dodgers ’ manager, and a seat at the head of a picnic table was reserved for him. Mattingly demurred and folded his 6-foot frame into another chair on an outdoor patio after casually brushing off a dried bird dropping stuck to it. If only 25 | Auburn said that someone poisoned oak trees at Toomer’s Corner, where fans celebrate big wins, and that the trees, which are estimated to be more than 130 years old, could not be saved. 
Auburn said a herbicide commonly used to kill trees was applied “in lethal amounts” to the soil. A caller to a radio show claimed he had applied 26 | It’s not every day that Max Klimavicius, the owner of Sardi’s, personally cuts a customer’s filet mignon into bite-size pieces. Then again, not every customer has just won Best in Show at the Westminster Kennel Club Dog Show . The chef had cooked the steak until it was medium rare, lightly seasoned it with salt and pepper, then 27 | Camille Richardson has heard all the arguments, read all the comments, and sees the logic. But as a freshman midfielder for the Columbia women’s lacrosse team who is fully aware of the dangers of head trauma, Richardson makes one thing clear: She has no interest in wearing a helmet, as the men must. “Wearing a helmet,” Richardson 28 | The Boston Athletic Association announced new registration procedures for the Boston Marathon in response to the growing demand that will leave some of the fastest runners on the sideline. The field of nearly 27,000 for this year filled up in eight hours. Organizers said the top qualifiers would be allowed to enter first under a two-week, online, 29 | The DVD was sitting in Michael Waltrip ’s house for nine and a half years while the accident churned inside him. His big sister Connie had recorded every race of his. As soon as the Daytona 500 went off the air that fatal day , she decorated the case with stars and happy faces to commemorate his first Daytona — his first Nascar Cup 30 | DAYTONA BEACH, Fla. — Dale Earnhardt Jr. crashed during practice at Daytona International Speedway on Wednesday, costing him the pole position for the Daytona 500 on Sunday. Earnhardt captured the pole last Sunday, but he will now have to switch to a backup car. Under Nascar rules, that means he will have to start from the back of the field. 31 | DAYTONA BEACH, Fla. 
— A year after potholes led to embarrassing delays in the Daytona 500, a $20 million repaving job at Daytona International Speedway is helping to create another set of concerns for Nascar ’s season-opening showcase event. Nascar officials mandated a series of adjustments to the racecars this week, the latest coming 32 | Lance Armstrong , the seven-time Tour de France winner who is the target of a federal investigation into doping in cycling, announced Wednesday that he had retired from his sport — this time for good. Armstrong, who is 39 and a cancer survivor, said he was leaving to spend more time with his family — he has five children — and to 33 | Arsenal stunned Barcelona with a second-half comeback in the European Champions League at Emirate Stadium on Wednesday in London, with the substitute Andrey Arshavin scoring the winner in the 83rd minute in a 2-1 victory. The Gunners were outplayed for most of the first leg of the Round of 16 meeting. The second leg is scheduled to be played on 34 | The Turkish Basketball Federation lifted the provisional suspension of Diana Taurasi on Wednesday after the lab that conducted Taurasi’s positive test retracted its report. The lab issued the change after it evaluated Taurasi’s statements in her defense. The federation did not say whether the lab made a mistake. Taurasi, 28, who had her 35 | Anna Chakvetadze collapsed on the court as she was serving for the second set against top-seeded Caroline Wozniacki at the Dubai Championships in the United Arab Emirates and had to withdraw. After losing the first set, 6-1, Chakvetadze was ahead, 5-3, when she lost a long rally to Wozniacki, wobbled and fainted. After treatment, Chakvetadze 36 | LOS ANGELES — So what if the weather on the horizon is as forbidding as the maître d’ at Koi glaring at the common people? The specter of three days of rain in Southern California has not deterred the field at Riviera Country Club for the Northern Trust Open. At least, not yet. 
With 11 of the top 20 players in the World Golf 37 | The Devils rookie Nick Palmieri gave the puck to Ilya Kovalchuk along the Carolina goal line early in the second period of a scoreless game, then got a chance to watch Kovalchuk, a $100 million Russian superstar, put on a show. Kovalchuk skated to the point, reversed direction, went back down to the left circle, hit the brakes to lose defenseman 38 | Kemba Walker had the ball at about the free-throw line and he was being covered by a player 9 inches taller than him. He had quite a list of possible plays in front of him. The one he chose is not on the list of options for almost every other player. Walker faked to his left, then threw the ball hard off the backboard and — since he was the 39 | Looking nothing like the two-time defending N.B.A. champions, the Los Angeles Lakers dropped their third straight game, a 104-99 loss Wednesday night on the road to the Cleveland Cavaliers — the league’s worst team, which avenged a 55-point embarrassment against Los Angeles last month. Ramon Sessions came off the bench and scored a 40 | The Knicks settled into the All-Star break on Wednesday at nearly the same point where they started the season. Only now they have two more wins than losses. For the Knicks, any progress is significant toward ending a six-year playoff absence. After a 102-90 victory over the Atlanta Hawks at Madison Square Garden, the Knicks (28-26) squeezed into 41 | TAMPA, Fla. — Joba Chamberlain arrived at the most important spring training of his young career listed at 230 pounds , just as he was all last season. This would not be a problem except that Chamberlain weighs more than 230 pounds, and the Yankees are hardly pleased that he does. Asked Wednesday morning for his impression of Chamberlain, 42 | In late 1999, three friends created a Web site to solicit fans to acquire the Jets . The quixotic effort at one point claimed commitments worth $20 million from 11,000 people. But the N.F.L. 
rejected the plan, and Woody Johnson paid $635 million for the team in early 2000. On Wednesday, three friends started an Internet bid to acquire the Mets and 43 | PORT ST. LUCIE, Fla. — Jeff Wilpon, the Mets ’ chief operating officer, spoke to reporters Wednesday for the first time since a lawsuit seeking as much as $1 billion from the team’s owners was unsealed on Feb. 4. Wilpon, the son of the longtime principal owner of the Mets, Fred Wilpon , said his family had received many offers to 44 | PORT ST. LUCIE, Fla. — A contrite Francisco Rodriguez arrived in Mets camp Wednesday and promised that he had learned from his mistakes and had become a better person. At the same time, Rodriguez vowed that in some respects he would not change. “On the mound, it’s going to be the same,” he said. “It’s going to be 45 | CLEARWATER, Fla. He stood beneath a palm tree on a clear Florida morning, commanding attention as he always has across a lifetime in baseball. Yet the routines of spring training were gone for Dallas Green. There is nothing routine about coping with horror. Christina-Taylor Green was the youngest victim of the shooting in Tucson last month that 46 | JUPITER, Fla. — In explaining how the St. Louis Cardinals have reached the end of negotiations to extend the contract of Albert Pujols, the team’s owner, Bill DeWitt Jr., really did not have to utter much more than one short declarative sentence. “We’re not the Yankees ,” he said after Pujols’s self-imposed noon 47 | A day before a scheduled arbitration hearing, the Brewers and second baseman Rickie Weeks agreed to a four-year, $38.5 million deal. A 2015 option could increase the total value to $50 million. Weeks, 28, hit .269 with 29 homers, 83 runs batted in and 112 runs last year. Sports Briefing | Baseball 48 | The day in sports, including cricket, skiing, and pitchers and catchers. 49 | The day in sports, including cricket, skiing, and pitchers and catchers. 
50 | The day in sports, including cricket, skiing, and pitchers and catchers. 51 |
--------------------------------------------------------------------------------
/intro_web_data/stopwords.txt:
--------------------------------------------------------------------------------
i
me
my
myself
we
our
ours
ourselves
you
your
yours
yourself
yourselves
he
him
his
himself
she
her
hers
herself
it
its
itself
they
them
their
theirs
themselves
what
which
who
whom
this
that
these
those
am
is
are
was
were
be
been
being
have
has
had
having
do
does
did
doing
a
an
the
and
but
if
or
because
as
until
while
of
at
by
for
with
about
against
between
into
through
during
before
after
above
below
to
from
up
down
in
out
on
off
over
under
again
further
then
once
here
there
when
where
why
how
all
any
both
each
few
more
most
other
some
such
no
nor
not
only
own
same
so
than
too
very
s
t
can
will
just
don
should
now
the
and
this
with
for
not
but
with
how

--------------------------------------------------------------------------------
/intro_web_data/tag_clustering.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# encoding: utf-8
"""
tag_clustering.py

Created by Hilary Mason on 2011-02-18.
Copyright (c) 2011 Hilary Mason. All rights reserved.
"""

import csv

import numpy
from Pycluster import *

class TagClustering(object):

    def __init__(self):
        tag_data = self.load_link_data()
        # print tag_data
        all_tags = []
        all_urls = []
        for url,tags in tag_data.items():
            all_urls.append(url)
            all_tags.extend(tags)

        all_tags = list(set(all_tags)) # list of all tags in the space

        numerical_data = [] # create vectors for each item
        for url,tags in tag_data.items():
            v = []
            for t in all_tags:
                if t in tags:
                    v.append(1)
                else:
                    v.append(0)
            numerical_data.append(tuple(v))
        data = numpy.array(numerical_data)

        # cluster the items
        # labels, error, nfound = kcluster(data, nclusters=20, dist='e') # 20 clusters, euclidean distance
        # labels, error, nfound = kcluster(data, nclusters=20, dist='b',npass=10) # 20 clusters, city-block distance, iterate 10 times
        labels, error, nfound = kcluster(data, nclusters=30, dist='a',npass=10) # 30 clusters, abs val of the correlation distance, iterate 10 times

        # print out the clusters
        clustered_urls = {}
        clustered_tags = {}
        i = 0
        for url in all_urls:
            clustered_urls.setdefault(labels[i], []).append(url)
            clustered_tags.setdefault(labels[i], []).extend(tag_data[url])
            i += 1

        for cluster_id,urls in clustered_urls.items():
            print cluster_id
            print urls

        # for cluster_id,tags in clustered_tags.items():
        #     print cluster_id
        #     print list(set(tags))


    def load_link_data(self,filename="links.csv"):
        data = {}

        r = csv.reader(open(filename, 'r'))
        for row in r:
            data[row[0]] = row[1].split(',')

        return data


if __name__ == '__main__':
    t = TagClustering()

--------------------------------------------------------------------------------
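Note on tag_clustering.py: before clustering, the script turns each URL's tag list into a 0/1 vector over the full tag vocabulary. The sketch below isolates just that step so it can be run without Pycluster or links.csv. It is an illustration under stated assumptions, not the author's code: it is written for Python 3 (the scripts in this listing are Python 2), the sample `tag_data` dict and the helper name `build_tag_vectors` are made up for the example, and the vocabulary and URLs are sorted only to make the output order repeatable (the original keeps whatever order `set()` and `dict.items()` produce).

```python
# Hypothetical stand-in for the {url: [tags]} dict that load_link_data() returns.
tag_data = {
    "http://example.com/a": ["python", "data"],
    "http://example.com/b": ["data", "viz"],
    "http://example.com/c": ["python"],
}


def build_tag_vectors(tag_data):
    """Return (urls, vocabulary, vectors) with one 0/1 vector per URL.

    vectors[i][j] is 1 when urls[i] carries the tag vocabulary[j], else 0.
    """
    urls = sorted(tag_data)
    # Every distinct tag seen across all URLs, in a fixed (sorted) order.
    vocabulary = sorted({tag for tags in tag_data.values() for tag in tags})
    vectors = [
        [1 if tag in tag_data[url] else 0 for tag in vocabulary]
        for url in urls
    ]
    return urls, vocabulary, vectors


urls, vocabulary, vectors = build_tag_vectors(tag_data)
for url, vector in zip(urls, vectors):
    print(url, vector)
```

Rows built this way can be stacked into the array that the script passes to its clustering call. Returning `urls` in the same order as `vectors` keeps each cluster label aligned with its URL.
--------------------------------------------------------------------------------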
/solving_problems/bloom_filter.py:
--------------------------------------------------------------------------------
from hashes.bloom import bloomfilter

hash1 = bloomfilter('imastring')
print hash1.hashbits, hash1.num_hashes # default values (see below)

hash1.add('imastring string')

# print 'test string' in hash1
for word in 'bloom filters are the best'.split():
    hash1.add(word)

if 'machine' in hash1:
    print "machine!"

--------------------------------------------------------------------------------
/solving_problems/decision_tree_regression.py:
--------------------------------------------------------------------------------
import numpy as np

# Create a random dataset
rng = np.random.RandomState(1)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(16))

# Fit regression model
from sklearn.tree import DecisionTreeRegressor

clf_1 = DecisionTreeRegressor(max_depth=2)
clf_2 = DecisionTreeRegressor(max_depth=5)
clf_1.fit(X, y)
clf_2.fit(X, y)

# Predict
X_test = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]
y_1 = clf_1.predict(X_test)
y_2 = clf_2.predict(X_test)

# Plot the results
import pylab as pl

pl.figure()
pl.scatter(X, y, c="k", label="data")
pl.plot(X_test, y_1, c="g", label="max_depth=2", linewidth=2)
pl.plot(X_test, y_2, c="r", label="max_depth=5", linewidth=2)
pl.xlabel("data")
pl.ylabel("target")
pl.title("Decision Tree Regression")
pl.legend()
#pl.show()
pl.savefig('decision_tree_regression.png', format='png')

--------------------------------------------------------------------------------
/solving_problems/flat.txt:
--------------------------------------------------------------------------------
1 | Flat Glasses 2 | Flatpack Bunny 3 | Flatpack Monkey 4 | Flat Pack Fastenerless (FPF) Game Table With
Reversible Top 5 | Axim X51v flatpack cradle 6 | Flatfile Nameplate 7 | Print Flat - Roll Into 3D, Heptagonal Column 8 | FlatRoll Airfoil 9 | FlatRoll with Adjustable Thickness, Pentagonal Column 10 | Lock-Tab FlatRoll, Hexagonal Column 11 | Flatpack Sphere 12 | Interlocking Puzzle Piece Flat 13 | Calibration -flat- square 14 | Flat decorative Christmas tree 15 | Flat Bottom Shotglass 16 | 3D from any 2D (or From Flat to Cat) 17 | Urinal with flat bottom 18 | Flat drivenut for 12x6x2 (mm) trapezium thread 19 | Iris Box V2 Flat Base 20 | Stepper motor gear for Reprap Mendel with flat on motor shaft 21 | aMESS RAMPS Flattened Enclosure v.0.1.2 - Arduino Modular Enclosure System Stack 22 | Supa-Flat X-Carriage 23 | Flat Yodsta/Gangda 24 | Parametric inflater nozzle 25 | Flatfooted Soldier Boy 26 | Flat Teardrop 27 |
--------------------------------------------------------------------------------
/solving_problems/kmeans_descriptions.py:
--------------------------------------------------------------------------------
import csv
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import Vectorizer
from sklearn import metrics

from sklearn.cluster import KMeans, MiniBatchKMeans

import logging
from optparse import OptionParser
import sys
from time import time

import numpy as np


# Display progress logs on stdout
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(levelname)s %(message)s')

# parse commandline arguments
op = OptionParser()
op.add_option("--no-minibatch",
              action="store_false", dest="minibatch", default=True,
              help="Use ordinary k-means algorithm.")

print __doc__
op.print_help()

(opts, args) = op.parse_args()
if len(args) > 0:
    op.error("this script takes no arguments.")
    sys.exit(1)


input_data = csv.reader(open('descriptions_100.csv','rb'))
dataset_data = []
dataset_target = []
for row in input_data:
    dataset_data.append(row[1])
    dataset_target.append(row[0])

labels = dataset_target
true_k = np.unique(labels).shape[0]

print "Extracting features from the training dataset using a sparse vectorizer"
t0 = time()
vectorizer = Vectorizer(max_df=0.95, max_features=10000)
X = vectorizer.fit_transform(dataset_data)
print X

print "done in %fs" % (time() - t0)
print "n_samples: %d, n_features: %d" % X.shape


###############################################################################
# Do the actual clustering

km = MiniBatchKMeans(k=true_k, init='k-means++', n_init=1,init_size=1000,batch_size=1000, verbose=1)

print "Clustering with %s" % km
t0 = time()
km.fit(X)
print "done in %0.3fs\n" % (time() - t0)
print km.labels_

# print "Homogeneity: %0.3f" % metrics.homogeneity_score(labels, km.labels_)
# print "Completeness: %0.3f" % metrics.completeness_score(labels, km.labels_)
# print "V-measure: %0.3f" % metrics.v_measure_score(labels, km.labels_)



--------------------------------------------------------------------------------
/solving_problems/liked_decision_tree.py:
--------------------------------------------------------------------------------
import sys, os
import csv

from sklearn import tree

if __name__ == '__main__':
    input_file = "thingiverse_liked_objects_1k.csv"
    input_data = csv.reader(open(input_file, 'rb'))

    data_features = []
    data_labels = []

    for row in input_data:
        data_features.append([row[0], row[1]])
        data_labels.append(row[2])

    # sklearn.tree.DecisionTreeClassifier(criterion='gini', max_depth=None, min_split=1,
    # min_density=0.10000000000000001, max_features=None, compute_importances=False, random_state=None)

    dt = tree.DecisionTreeClassifier(min_split=10)
    dt = dt.fit(data_features, data_labels)

    print dt.predict([50,500])

    o = tree.export_graphviz(dt,out_file='thingiverse_tree.dot',feature_names=['user_id','num_likes'])

--------------------------------------------------------------------------------
/solving_problems/map_reduce.py:
--------------------------------------------------------------------------------
#!/usr/bin/python
#
# Licensed to Cloudera, Inc. under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. Cloudera, Inc. licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#
# Template for python Hadoop streaming. Fill in the map() and reduce()
# functions, which should call emit(), as appropriate.
#
# Test your script with
# cat input | python wordcount.py map | sort | python wordcount.py reduce

import sys
import re
try:
    import simplejson as json
except ImportError:
    import json

import __builtin__

def map(line):
    words = line.split()
    for word in words:
        emit(word, str(1))

def reduce(key, values):
    emit(key, str(sum(__builtin__.map(int,values))))

# Common library code follows:

def emit(key, value):
    """
    Emits a key->value pair. Key and value should be strings.
    """
    try:
        print "\t".join( (key, value) )
    except:
        pass

def run_map():
    """Calls map() for each input value."""
    for line in sys.stdin:
        line = line.rstrip()
        map(line)

def run_reduce():
    """Gathers reduce() data in memory, and calls reduce()."""
    prev_key = None
    values = []
    for line in sys.stdin:
        line = line.rstrip()
        key, value = re.split("\t", line, 1)
        if prev_key == key:
            values.append(value)
        else:
            if prev_key is not None:
                reduce(prev_key, values)
            prev_key = key
            values = [ value ]

    if prev_key is not None:
        reduce(prev_key, values)

def main():
    """Runs map or reduce code, per arguments."""
    if len(sys.argv) != 2 or sys.argv[1] not in ("map", "reduce"):
        print "Usage: %s <map|reduce>" % sys.argv[0]
        sys.exit(1)
    if sys.argv[1] == "map":
        run_map()
    elif sys.argv[1] == "reduce":
        run_reduce()
    else:
        assert False

if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/solving_problems/path_distribution.txt:
--------------------------------------------------------------------------------
1 | 132 /autoscript/autoscript.css 2 | 130 /autoscript/makescript.php 3 | 130 /autoscript/ 4 | 115 /favicon.ico 5 | 66 /autoscript/copy.js 6 | 49 / 7 | 22 /wp/feed/ 8 | 20 /robots.txt 9 | 11 /autoscript/index_fr.html 10 | 9 /autoscript/images/autoscript_008.jpg 11 | 9 /autoscript/images/autoscript_007.jpg 12 | 9 /autoscript/images/autoscript_004.jpg 13 | 9 /autoscript/images/autoscript_003.jpg 14 | 9 /autoscript/images/autoscript_001.jpg 15 | 9 /autoscript/howto.html 16 | 8 /autoscript/Script%20Me!.files/autoscript.css 17 | 7 /wp/feed/atom/ 18 | 7 /autoscript/makescript_fr.php 19 | 7 /autoscript/index_de.html 20 | 6 /wp/education/an-experience-with-using-a-wiki-for-a-collaborative-classroom-documentation-project/feed/ 21 | 6
/autoscript/stats/ 22 | 5 /autoscript/makescript_de.php 23 | 4 /wp/?feed=rss2 24 | 4 /autoscript 25 | 3 /wp/articles/what-do-those-php-errors-really-mean-anyway/ 26 | 2 /wp/ 27 | 2 /autoscript/index_jp.html 28 | 1 /wp/wp-login.php 29 | 1 /wp/wp-content/uploads/2006/ 30 | 1 /wp/virtual-worlds/miniego-is-exactly-that/trackback/ 31 | 1 /wp/questions/should-i-pay-for-a-search-engine-submission-service 32 | 1 /wp/ephemera/offline-web-applications-coming-soon/ 33 | 1 /wp//wp-login.php 34 | 1 /wordpress//wp-login.php 35 | 1 /twit.php 36 | 1 /sitemap.xml 37 | 1 /sitemap 38 | 1 /scrit 39 | 1 /labs/twitter/setup.html 40 | 1 /labs/twitter/graphviz-2.16.1/windows/plugin/lib/Release/ 41 | 1 /labs/twitter/graphviz-2.16.1/windows/lib/gd/gd.dsp 42 | 1 /labs/twitter/graphviz-2.16.1/windows/lib/fdpgen/makefile 43 | 1 /labs/twitter/graphviz-2.16.1/windows/cmd/bin/ 44 | 1 /labs/twitter/graphviz-2.16.1/tclpkg/tclpathplan/demo/pathplan_data/Makefile.in 45 | 1 /labs/twitter/graphviz-2.16.1/tclpkg/tclpathplan/demo/Makefile.am 46 | 1 /labs/twitter/graphviz-2.16.1/tclpkg/tclpathplan/demo/ 47 | 1 /labs/twitter/graphviz-2.16.1/tclpkg/gv/demo/ 48 | 1 /labs/twitter/graphviz-2.16.1/rtest/Makefile 49 | 1 /labs/twitter/graphviz-2.16.1/libltdl/Makefile 50 | 1 /labs/twitter/graphviz-2.16.1/lib/twopigen/Makefile 51 | 1 /labs/twitter/graphviz-2.16.1/lib/pathplan/libpathplan.pc 52 | 1 /labs/twitter/graphviz-2.16.1/lib/neatogen/constrained_majorization_ipsep.c 53 | 1 /labs/twitter/graphviz-2.16.1/lib/graph/y.tab.h 54 | 1 /labs/twitter/graphviz-2.16.1/lib/graph/parser.h 55 | 1 /labs/twitter/graphviz-2.16.1/lib/graph/libgraph.h 56 | 1 /labs/twitter/graphviz-2.16.1/lib/expr/exparse.y 57 | 1 /labs/twitter/graphviz-2.16.1/lib/common/y.tab.c 58 | 1 /labs/twitter/graphviz-2.16.1/lib/common/shapes.c 59 | 1 /labs/twitter/graphviz-2.16.1/lib/common/ps_fontmap.txt 60 | 1 /labs/twitter/graphviz-2.16.1/lib/common/ns.c 61 | 1 /labs/twitter/graphviz-2.16.1/lib/common/htmlparse.y 62 | 1 
/labs/twitter/graphviz-2.16.1/lib/common/geom.c 63 | 1 /labs/twitter/graphviz-2.16.1/lib/cgraph/grammar.y 64 | 1 /labs/twitter/graphviz-2.16.1/lib/cgraph/Makefile.in 65 | 1 /labs/twitter/graphviz-2.16.1/lib/agraph/write.c 66 | 1 /labs/twitter/graphviz-2.16.1/graphs/directed/Makefile 67 | 1 /labs/twitter/graphviz-2.16.1/graphs/Makefile 68 | 1 /labs/twitter/graphviz-2.16.1/features/ 69 | 1 /labs/twitter/graphviz-2.16.1/dot.demo/gv_test.py 70 | 1 /labs/twitter/graphviz-2.16.1/cmd/tools/Makefile.old 71 | 1 /labs/twitter/graphviz-2.16.1/cmd/lefty/examples/ 72 | 1 /labs/twitter/graphviz-2.16.1/cmd/dotty/mswin32/dotty.c 73 | 1 /labs/twitter/graphviz-2.16.1/cmd/dotty/mswin32/ 74 | 1 /labs/twitter/graphviz-2.16.1/cmd/dotty/Makefile 75 | 1 /labs/twitter/graphviz-2.16.1/builddate.h 76 | 1 /labs/twitter/graphviz-2.16.1/ 77 | 1 /crossdomain.xml 78 | 1 /blog//wp-login.php 79 | 1 //wp-login.php 80 |
--------------------------------------------------------------------------------
/solving_problems/pca.py:
--------------------------------------------------------------------------------
import csv
import pylab as pl

from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.lda import LDA

input_file = "thingiverse_liked_objects_1k.csv"
input_data = csv.reader(open(input_file, 'rb'))

data_features = []
data_labels = []

for row in input_data:
    data_features.append([float(row[0]), float(row[1])])
    data_labels.append(row[2])

X = data_features
target_names = data_labels
y = data_labels
# print X

pca = PCA(n_components=2)
X_r = pca.fit(X).transform(X)
# print X_r.tolist()

# Percentage of variance explained for each components
# print 'explained variance ratio (first two components): %s' % pca.explained_variance_ratio_

--------------------------------------------------------------------------------
/solving_problems/scripts/bar_chart.py:
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Copyright 2010 bit.ly 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); you may 6 | # not use this file except in compliance with the License. You may obtain 7 | # a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 13 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 14 | # License for the specific language governing permissions and limitations 15 | # under the License. 16 | 17 | """ 18 | Generate an ascii bar chart for input data 19 | 20 | http://github.com/bitly/data_hacks 21 | """ 22 | import sys 23 | import math 24 | from collections import defaultdict 25 | from optparse import OptionParser 26 | 27 | def load_stream(input_stream): 28 | for line in input_stream: 29 | clean_line = line.strip() 30 | if not clean_line: 31 | # skip empty lines (ie: newlines) 32 | continue 33 | if clean_line[0] in ['"', "'"]: 34 | clean_line = clean_line.strip('"').strip("'") 35 | if clean_line: 36 | yield clean_line 37 | 38 | def run(input_stream, options): 39 | data = defaultdict(lambda:0) 40 | for row in input_stream: 41 | data[row]+=1 42 | 43 | if not data: 44 | print "Error: no data" 45 | sys.exit(1) 46 | 47 | max_length = max([len(key) for key in data.keys()]) 48 | max_length = min(max_length, 50) 49 | value_characters = 80 - max_length 50 | max_value = max(data.values()) 51 | scale = int(math.ceil(float(max_value) / value_characters)) 52 | scale = max(1, scale) 53 | 54 | print "# each * represents a count of %d" % scale 55 | 56 | if options.sort_values: 57 | # sort by values 58 | data = [[value,key] for key,value in data.items()] 59 | if options.reverse_sort: 60 | data.sort(reverse=True) 61 | else: 62 | data.sort() 63 | 
else: 64 | data = [[key,value] for key,value in data.items()] 65 | data.sort(reverse=options.reverse_sort) 66 | data = [[value, key] for key,value in data] 67 | format = "%" + str(max_length) + "s [%6d] %s" 68 | for value,key in data: 69 | print format % (key[:max_length], value, (value / scale) * "*") 70 | 71 | if __name__ == "__main__": 72 | parser = OptionParser() 73 | parser.usage = "cat data | %prog [options]" 74 | parser.add_option("-k", "--sort-keys", dest="sort_keys", default=True, action="store_true", 75 | help="sort by the key [default]") 76 | parser.add_option("-v", "--sort-values", dest="sort_values", default=False, action="store_true", 77 | help="sort by the frequence") 78 | parser.add_option("-r", "--reverse-sort", dest="reverse_sort", default=False, action="store_true", 79 | help="reverse the sort") 80 | 81 | (options, args) = parser.parse_args() 82 | 83 | if sys.stdin.isatty(): 84 | parser.print_usage() 85 | print "for more help use --help" 86 | sys.exit(1) 87 | run(load_stream(sys.stdin), options) 88 | 89 | -------------------------------------------------------------------------------- /solving_problems/scripts/histogram.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Copyright 2010 bit.ly 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); you may 6 | # not use this file except in compliance with the License. You may obtain 7 | # a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 13 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 14 | # License for the specific language governing permissions and limitations 15 | # under the License. 
16 | 17 | """ 18 | Generate a text format histogram 19 | 20 | This is a loose port to python of the Perl version at 21 | http://www.pandamatak.com/people/anand/xfer/histo 22 | 23 | http://github.com/bitly/data_hacks 24 | """ 25 | 26 | import sys 27 | from decimal import Decimal 28 | import math 29 | from optparse import OptionParser 30 | 31 | class MVSD(object): 32 | """ A class that calculates a running Mean / Variance / Standard Deviation""" 33 | def __init__(self): 34 | self.is_started = False 35 | self.ss = Decimal(0) # (running) sum of square deviations from mean 36 | self.m = Decimal(0) # (running) mean 37 | self.total_w = Decimal(0) # weight of items seen 38 | 39 | def add(self, x, w=1): 40 | """ add another datapoint to the Mean / Variance / Standard Deviation""" 41 | if not isinstance(x, Decimal): 42 | x = Decimal(x) 43 | if not self.is_started: 44 | self.m = x 45 | self.ss = Decimal(0) 46 | self.total_w = w 47 | self.is_started = True 48 | else: 49 | temp_w = self.total_w + w 50 | self.ss += (self.total_w * w * (x - self.m) * (x - self.m )) / temp_w 51 | self.m += (x - self.m) / temp_w 52 | self.total_w = temp_w 53 | 54 | # print "added %-2d mean=%0.2f var=%0.2f std=%0.2f" % (x, self.mean(), self.var(), self.sd()) 55 | 56 | def var(self): 57 | return self.ss / self.total_w 58 | 59 | def sd(self): 60 | return math.sqrt(self.var()) 61 | 62 | def mean(self): 63 | return self.m 64 | 65 | def test_mvsd(): 66 | mvsd = MVSD() 67 | for x in range(10): 68 | mvsd.add(x) 69 | 70 | assert '%.2f' % mvsd.mean() == "4.50" 71 | assert '%.2f' % mvsd.var() == "8.25" 72 | assert '%.14f' % mvsd.sd() == "2.87228132326901" 73 | 74 | def load_stream(input_stream): 75 | for line in input_stream: 76 | clean_line = line.strip() 77 | if not clean_line: 78 | # skip empty lines (ie: newlines) 79 | continue 80 | if clean_line[0] in ['"', "'"]: 81 | clean_line = clean_line.strip('"').strip("'") 82 | try: 83 | yield Decimal(clean_line) 84 | except: 85 | print >>sys.stderr, "invalid line 
%r" % line 86 | 87 | def median(values): 88 | length = len(values) 89 | if length%2: 90 | median_indices = [length/2] 91 | else: 92 | median_indices = [length/2-1, length/2] 93 | 94 | values = sorted(values) 95 | return sum([values[i] for i in median_indices]) / len(median_indices) 96 | 97 | def test_median(): 98 | assert 6 == median([8,7,9,1,2,6,3]) # odd-sized list 99 | assert 4 == median([4,5,2,1,9,10]) # even-sized int list. (4+5)/2 = 4 100 | assert "4.50" == "%.2f" % median([4.0,5,2,1,9,10]) #even-sized float list. (4.0+5)/2 = 4.5 101 | 102 | 103 | def histogram(stream, options): 104 | """ 105 | Loop over the stream and add each entry to the dataset, printing out at the end 106 | 107 | stream yields Decimal() 108 | """ 109 | if not options.min or not options.max: 110 | # glob the iterator here so we can do min/max on it 111 | data = list(stream) 112 | else: 113 | data = stream 114 | bucket_scale = 1 115 | 116 | if options.min: 117 | min_v = Decimal(options.min) 118 | else: 119 | min_v = min(data) 120 | if options.max: 121 | max_v = Decimal(options.max) 122 | else: 123 | max_v = max(data) 124 | buckets = options.buckets and int(options.buckets) or 10 125 | if buckets <= 0: 126 | raise ValueError('# of buckets must be > 0') 127 | if not max_v > min_v: 128 | raise ValueError('max must be > min. 
max:%s min:%s' % (max_v, min_v)) 129 | 130 | diff = max_v - min_v 131 | step = diff / buckets 132 | bucket_counts = [0 for x in range(buckets)] 133 | boundaries = [] 134 | for x in range(buckets): 135 | boundaries.append(min_v + (step * (x + 1))) 136 | 137 | skipped = 0 138 | samples = 0 139 | mvsd = MVSD() 140 | accepted_data = [] 141 | for value in data: 142 | samples +=1 143 | if options.mvsd: 144 | mvsd.add(value) 145 | accepted_data.append(value) 146 | # find the bucket this goes in 147 | if value < min_v or value > max_v: 148 | skipped +=1 149 | continue 150 | for bucket_position, boundary in enumerate(boundaries): 151 | if value <= boundary: 152 | bucket_counts[bucket_position] +=1 153 | break 154 | 155 | # auto-pick the hash scale 156 | if max(bucket_counts) > 75: 157 | bucket_scale = int(max(bucket_counts) / 75) 158 | 159 | print "# NumSamples = %d; Min = %0.2f; Max = %0.2f" % (samples, min_v, max_v) 160 | if skipped: 161 | print "# %d value%s outside of min/max" % (skipped, skipped > 1 and 's' or '') 162 | if options.mvsd: 163 | print "# Mean = %f; Variance = %f; SD = %f; Median %f" % (mvsd.mean(), mvsd.var(), mvsd.sd(), median(accepted_data)) 164 | print "# each * represents a count of %d" % bucket_scale 165 | bucket_min = min_v 166 | bucket_max = min_v 167 | for bucket in range(buckets): 168 | bucket_min = bucket_max 169 | bucket_max = boundaries[bucket] 170 | bucket_count = bucket_counts[bucket] 171 | star_count = 0 172 | if bucket_count: 173 | star_count = bucket_count / bucket_scale 174 | print '%10.4f - %10.4f [%6d]: %s' % (bucket_min, bucket_max, bucket_count, '*' * star_count) 175 | 176 | 177 | if __name__ == "__main__": 178 | parser = OptionParser() 179 | parser.usage = "cat data | %prog [options]" 180 | parser.add_option("-m", "--min", dest="min", 181 | help="minimum value for graph") 182 | parser.add_option("-x", "--max", dest="max", 183 | help="maximum value for graph") 184 | parser.add_option("-b", "--buckets", dest="buckets", 185 | 
help="Number of buckets to use for the histogram") 186 | parser.add_option("--no-mvsd", dest="mvsd", action="store_false", default=True, 187 | help="Disable the calculation of Mean, Variance and SD. (improves performance)") 188 | 189 | (options, args) = parser.parse_args() 190 | if sys.stdin.isatty(): 191 | # if isatty() that means it's run without anything piped into it 192 | parser.print_usage() 193 | print "for more help use --help" 194 | sys.exit(1) 195 | histogram(load_stream(sys.stdin), options) 196 | 197 | -------------------------------------------------------------------------------- /solving_problems/scripts/ninety_five_percent.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Copyright 2010 bit.ly 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); you may 6 | # not use this file except in compliance with the License. You may obtain 7 | # a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 13 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 14 | # License for the specific language governing permissions and limitations 15 | # under the License. 
16 | 17 | """ 18 | Calculate the 95% time from a list of times given on stdin 19 | 20 | http://github.com/bitly/data_hacks 21 | """ 22 | 23 | import sys 24 | import os 25 | from decimal import Decimal 26 | 27 | def run(): 28 | count = 0 29 | data = {} 30 | for line in sys.stdin: 31 | line = line.strip() 32 | if not line: 33 | # skip empty lines (ie: newlines) 34 | continue 35 | try: 36 | t = Decimal(line) 37 | except: 38 | print >>sys.stderr, "invalid line %r" % line 39 | continue 40 | count +=1 41 | data[t] = data.get(t, 0) + 1 42 | print calc_95(data, count) 43 | 44 | def calc_95(data, count): 45 | # find the time it took for x entry, where x is the threshold 46 | threshold = Decimal(count) * Decimal('.95') 47 | start = Decimal(0) 48 | times = data.keys() 49 | times.sort() 50 | for t in times: 51 | # increment our count by the # of items in this time bucket 52 | start += data[t] 53 | if start > threshold: 54 | return t 55 | 56 | if __name__ == "__main__": 57 | if sys.stdin.isatty() or '--help' in sys.argv or '-h' in sys.argv: 58 | print "Usage: cat data | %s" % os.path.basename(sys.argv[0]) 59 | sys.exit(1) 60 | run() 61 | -------------------------------------------------------------------------------- /solving_problems/simhashes.py: -------------------------------------------------------------------------------- 1 | from hashes.simhash import simhash 2 | 3 | if __name__ == '__main__': 4 | f = open('flat.txt', 'r') 5 | #f = open('thingiverse_all_names.csv') 6 | data = [line.strip() for line in f.readlines()] 7 | f.close() 8 | 9 | # print data 10 | all_hashes = dict([(d, simhash(d)) for d in data]) 11 | 12 | for k, h in all_hashes.items(): 13 | print "%s %s" % (k, h) 14 | print all_hashes['Flatpack Bunny'].similarity(h) 15 | 16 | -------------------------------------------------------------------------------- /solving_problems/thingiverse_liked_objects_1k.csv: -------------------------------------------------------------------------------- 1 | 3,28,1 2 | 2,9,0 3 | 2,5,0 4 | 
2,10,1 5 | 9,14,1 6 | 3,6,0 7 | 11,26,1 8 | 3,11,1 9 | 24,5,0 10 | 1,17,0 11 | 1,2,0 12 | 37,10,1 13 | 47,0,0 14 | 12,3,0 15 | 1,2,0 16 | 1,0,0 17 | 1,6,0 18 | 1,27,0 19 | 1,1,0 20 | 1,2,0 21 | 15,13,1 22 | 1,11,0 23 | 1,0,0 24 | 2,7,0 25 | 1,1,0 26 | 59,16,1 27 | 20,1,0 28 | 37,75,1 29 | 2,0,0 30 | 20,9,1 31 | 1,1,0 32 | 3,7,1 33 | 37,4,1 34 | 2,0,0 35 | 63,1,0 36 | 64,3,0 37 | 65,1,0 38 | 66,0,0 39 | 37,29,0 40 | 37,42,0 41 | 67,1,0 42 | 37,0,0 43 | 37,21,1 44 | 83,27,0 45 | 93,1,0 46 | 107,1,0 47 | 107,1,0 48 | 107,1,0 49 | 107,1,0 50 | 107,1,0 51 | 107,1,0 52 | 109,6,1 53 | 107,8,0 54 | 107,3,0 55 | 107,1,0 56 | 107,2,0 57 | 107,2,0 58 | 107,3,0 59 | 107,23,0 60 | 107,3,0 61 | 107,3,0 62 | 107,1,0 63 | 121,1,0 64 | 1,3,0 65 | 136,0,0 66 | 133,2,0 67 | 1,0,0 68 | 11,5,0 69 | 113,6,0 70 | 1,13,0 71 | 189,7,0 72 | 37,7,0 73 | 46,4,0 74 | 181,9,0 75 | 2,3,0 76 | 196,2,0 77 | 41,0,0 78 | 67,1,0 79 | 109,4,0 80 | 3,20,1 81 | 75,6,0 82 | 15,4,0 83 | 207,2,0 84 | 216,0,0 85 | 212,2,0 86 | 218,0,0 87 | 71,3,0 88 | 213,17,1 89 | 219,0,0 90 | 220,1,0 91 | 212,3,1 92 | 214,1,0 93 | 217,38,1 94 | 46,0,0 95 | 1,1,0 96 | 1,32,0 97 | 159,2,0 98 | 5,1,0 99 | 71,13,1 100 | 71,0,0 101 | 71,0,0 102 | 226,0,0 103 | 226,0,0 104 | 226,0,0 105 | 226,0,0 106 | 226,0,0 107 | 75,13,1 108 | 226,0,0 109 | 226,0,0 110 | 223,6,1 111 | 222,2,0 112 | 225,0,0 113 | 226,0,0 114 | 224,3,0 115 | 226,0,0 116 | 228,2,0 117 | 229,50,1 118 | 228,6,1 119 | 2,12,1 120 | 181,5,0 121 | 75,3,0 122 | 1,3,0 123 | 75,24,0 124 | 37,40,1 125 | 169,33,1 126 | 75,39,1 127 | 75,18,1 128 | 169,13,1 129 | 244,2,1 130 | 75,1,0 131 | 75,3,1 132 | 239,9,0 133 | 3,22,1 134 | 3,112,1 135 | 189,5,0 136 | 75,4,0 137 | 223,4,0 138 | 265,13,1 139 | 265,11,1 140 | 265,1,0 141 | 265,0,0 142 | 265,1,0 143 | 265,3,0 144 | 223,4,0 145 | 272,7,0 146 | 67,0,0 147 | 67,0,0 148 | 243,9,1 149 | 239,4,1 150 | 3,1,0 151 | 227,0,0 152 | 227,4,0 153 | 83,3,0 154 | 1,1,0 155 | 287,0,0 156 | 288,2,0 157 | 293,1,0 158 | 225,7,1 159 | 11,10,1 
160 | 287,0,0 161 | 67,0,0 162 | 227,0,0 163 | 67,0,0 164 | 296,26,1 165 | 227,0,0 166 | 55,15,1 167 | 287,0,0 168 | 287,0,0 169 | 45,9,1 170 | 296,19,1 171 | 55,2,1 172 | 1,1,0 173 | 287,0,0 174 | 3,3,0 175 | 322,9,1 176 | 239,5,1 177 | 307,3,0 178 | 2,2,0 179 | 2,1,0 180 | 345,5,0 181 | 1,1,0 182 | 296,4,0 183 | 239,15,1 184 | 353,1,0 185 | 354,0,0 186 | 357,0,0 187 | 288,4,1 188 | 67,0,0 189 | 67,0,0 190 | 239,3,0 191 | 368,0,0 192 | 368,0,0 193 | 366,0,0 194 | 366,0,0 195 | 1,0,0 196 | 367,18,1 197 | 1,1,0 198 | 357,1,0 199 | 1,2,0 200 | 371,1,0 201 | 67,0,0 202 | 213,0,0 203 | 1,9,1 204 | 239,11,1 205 | 84,0,0 206 | 379,5,0 207 | 384,2,0 208 | 384,1,0 209 | 2,9,0 210 | 12,1,0 211 | 3,1,0 212 | 83,1,0 213 | 400,0,0 214 | 400,0,0 215 | 407,0,0 216 | 401,13,1 217 | 402,1,1 218 | 404,0,0 219 | 407,1,0 220 | 410,0,0 221 | 406,5,1 222 | 37,4,0 223 | 239,15,1 224 | 181,0,0 225 | 413,2,0 226 | 413,0,0 227 | 113,3,0 228 | 415,0,0 229 | 265,3,0 230 | 239,3,0 231 | 1,0,0 232 | 1,1,0 233 | 109,14,1 234 | 428,5,1 235 | 1,0,0 236 | 2,0,0 237 | 1,0,0 238 | 1,2,0 239 | 2,10,0 240 | 239,4,1 241 | 189,5,0 242 | 239,31,1 243 | 52,0,0 244 | 396,3,0 245 | 428,2,0 246 | 443,4,1 247 | 3,0,0 248 | 239,3,1 249 | 443,0,0 250 | 450,0,0 251 | 451,0,0 252 | 451,0,0 253 | 451,0,0 254 | 453,0,0 255 | 450,0,0 256 | 4,0,0 257 | 55,1,0 258 | 2,8,0 259 | 239,8,0 260 | 428,9,0 261 | 428,11,0 262 | 428,17,0 263 | 3,7,0 264 | 375,2,0 265 | 375,31,1 266 | 3,1,0 267 | 55,8,1 268 | 55,4,0 269 | 1,0,0 270 | 1,0,0 271 | 1,11,0 272 | 1,1,0 273 | 479,12,1 274 | 1,3,0 275 | 41,0,0 276 | 41,0,0 277 | 479,2,0 278 | 1,0,0 279 | 479,3,1 280 | 390,1,0 281 | 1,1,0 282 | 1,1,0 283 | 479,7,0 284 | 134,11,0 285 | 479,0,0 286 | 1,3,0 287 | 479,9,0 288 | 239,54,0 289 | 479,11,0 290 | 479,3,0 291 | 479,8,1 292 | 928,7,0 293 | 928,5,0 294 | 928,7,0 295 | 479,2,0 296 | 67,0,0 297 | 491,6,1 298 | 479,1,0 299 | 1,2,0 300 | 479,5,0 301 | 239,16,1 302 | 134,29,1 303 | 20,13,1 304 | 501,12,0 305 | 46,0,0 306 | 479,12,0 307 
| 375,16,0 308 | 479,6,0 309 | 450,0,0 310 | 1,3,0 311 | 55,133,1 312 | 11,2,0 313 | 134,8,0 314 | 11,4,1 315 | 109,16,1 316 | 75,2,1 317 | 928,15,1 318 | 75,1,0 319 | 512,3,0 320 | 75,6,1 321 | 181,3,0 322 | 479,9,0 323 | 479,1,0 324 | 11,3,0 325 | 75,0,0 326 | 479,8,0 327 | 479,0,0 328 | 479,4,0 329 | 75,1,1 330 | 75,22,1 331 | 928,9,0 332 | 928,7,0 333 | 479,2,0 334 | 75,4,0 335 | 1,3,0 336 | 479,1,0 337 | 75,7,1 338 | 134,21,0 339 | 375,4,0 340 | 479,15,0 341 | 75,0,0 342 | 479,6,0 343 | 528,18,1 344 | 75,11,1 345 | 479,11,1 346 | 75,8,0 347 | 75,5,0 348 | 479,6,0 349 | 531,27,1 350 | 531,4,1 351 | 531,15,1 352 | 928,139,1 353 | 181,10,0 354 | 479,9,1 355 | 1,27,0 356 | 479,11,1 357 | 479,8,0 358 | 479,1,0 359 | 535,6,1 360 | 479,7,0 361 | 225,0,0 362 | 46,5,0 363 | 1,5,0 364 | 1,20,0 365 | 3,3,0 366 | 239,5,0 367 | 145,0,0 368 | 479,4,0 369 | 479,0,0 370 | 479,1,0 371 | 75,3,0 372 | 2,28,0 373 | 296,53,1 374 | 479,10,0 375 | 1,8,0 376 | 479,5,0 377 | 75,9,1 378 | 75,17,1 379 | 479,4,0 380 | 75,0,0 381 | 75,4,0 382 | 75,5,0 383 | 554,8,1 384 | 75,10,1 385 | 134,32,1 386 | 75,10,0 387 | 181,2,0 388 | 390,4,0 389 | 479,3,0 390 | 1,4,0 391 | 479,3,0 392 | 75,1,0 393 | 554,15,0 394 | 75,11,1 395 | 367,13,0 396 | 1,10,0 397 | 367,3,0 398 | 479,33,0 399 | 367,11,0 400 | 1,11,0 401 | 296,8,0 402 | 564,11,0 403 | 367,10,1 404 | 479,3,0 405 | 569,2,0 406 | 1,5,0 407 | 479,15,0 408 | 580,0,0 409 | 582,0,0 410 | 517,2,0 411 | 517,3,0 412 | 579,0,0 413 | 579,1,0 414 | 579,2,0 415 | 584,1,0 416 | 583,4,0 417 | 1,6,0 418 | 479,6,1 419 | 75,1,0 420 | 586,13,0 421 | 479,5,0 422 | 75,2,0 423 | 479,15,0 424 | 75,25,0 425 | 479,6,0 426 | 479,9,0 427 | 546,4,0 428 | 467,15,0 429 | 2,10,0 430 | 554,4,0 431 | 367,5,0 432 | 390,12,0 433 | 75,0,0 434 | 577,4,0 435 | 367,5,1 436 | 309,6,1 437 | 1,1,0 438 | 591,5,0 439 | 390,0,0 440 | 374,1,0 441 | 591,0,0 442 | 75,10,0 443 | 374,4,0 444 | 75,10,0 445 | 602,10,0 446 | 367,8,0 447 | 497,6,0 448 | 20,2,0 449 | 602,1,0 450 | 390,1,0 451 | 
607,18,0 452 | 610,56,1 453 | 546,4,0 454 | 591,30,1 455 | 20,56,1 456 | 1,14,1 457 | 55,3,0 458 | 55,172,1 459 | 374,4,0 460 | 616,10,0 461 | 296,15,0 462 | 374,17,0 463 | 618,8,0 464 | 479,0,0 465 | 535,6,0 466 | 512,0,0 467 | 479,2,0 468 | 512,1,0 469 | 512,1,0 470 | 20,11,1 471 | 2,32,1 472 | 512,0,0 473 | 629,4,0 474 | 546,2,0 475 | 631,2,0 476 | 192,29,1 477 | 512,7,0 478 | 577,5,1 479 | 512,3,0 480 | 577,0,0 481 | 479,1,0 482 | 1,44,1 483 | 642,1,0 484 | 467,27,0 485 | 374,4,0 486 | 479,1,0 487 | 656,2,0 488 | 479,13,0 489 | 656,2,0 490 | 635,47,1 491 | 656,0,0 492 | 1,14,0 493 | 657,0,0 494 | 657,0,0 495 | 491,33,1 496 | 2,6,0 497 | 479,0,0 498 | 577,2,0 499 | 55,9,1 500 | 309,4,0 501 | 2,27,0 502 | 242,0,0 503 | 222,28,0 504 | 481,3,0 505 | 3,7,0 506 | 683,16,1 507 | 683,22,0 508 | 512,5,0 509 | 512,0,0 510 | 512,1,0 511 | 435,0,0 512 | 692,0,0 513 | 689,0,0 514 | 1,23,1 515 | 690,0,0 516 | 690,0,0 517 | 692,5,1 518 | 691,8,1 519 | 693,8,0 520 | 134,48,1 521 | 635,33,1 522 | 367,13,1 523 | 701,20,1 524 | 706,5,0 525 | 308,45,1 526 | 134,14,0 527 | 390,1,0 528 | 225,1,0 529 | 83,9,1 530 | 83,6,0 531 | 721,9,0 532 | 717,51,1 533 | 2,0,0 534 | 717,39,1 535 | 512,0,0 536 | 390,1,0 537 | 467,1,0 538 | 554,3,0 539 | 396,3,0 540 | 87,4,0 541 | 288,10,1 542 | 239,12,0 543 | 686,2,0 544 | 737,14,0 545 | 607,15,0 546 | 367,11,1 547 | 390,0,0 548 | 747,1,0 549 | 375,27,0 550 | 748,0,0 551 | 749,0,0 552 | 750,0,0 553 | 743,0,0 554 | 746,0,0 555 | 745,0,0 556 | 744,0,0 557 | 591,2,0 558 | 586,26,1 559 | 753,2,0 560 | 756,7,1 561 | 756,5,0 562 | 759,8,0 563 | 374,25,0 564 | 479,7,0 565 | 479,0,0 566 | 683,22,1 567 | 479,15,0 568 | 1,38,1 569 | 367,18,0 570 | 367,48,0 571 | 583,2,0 572 | 374,26,0 573 | 750,3,0 574 | 928,17,0 575 | 928,4,0 576 | 928,13,0 577 | 607,5,1 578 | 769,1,0 579 | 428,11,0 580 | 607,4,0 581 | 181,22,0 582 | 791,8,0 583 | 793,4,0 584 | 793,6,0 585 | 794,0,0 586 | 781,11,0 587 | 781,20,0 588 | 12,12,0 589 | 479,1,0 590 | 479,11,1 591 | 134,21,1 592 | 
801,19,1 593 | 479,4,1 594 | 802,13,0 595 | 506,4,0 596 | 12,5,0 597 | 546,1,0 598 | 793,8,0 599 | 134,4,0 600 | 367,59,1 601 | 506,9,0 602 | 181,4,0 603 | 367,47,1 604 | 769,7,1 605 | 691,2,0 606 | 546,0,0 607 | 267,18,1 608 | 822,20,1 609 | 656,2,0 610 | 795,50,0 611 | 828,1,1 612 | 656,6,0 613 | 793,6,0 614 | 87,11,1 615 | 586,20,0 616 | 717,8,0 617 | 37,3,0 618 | 834,4,1 619 | 834,4,1 620 | 834,0,0 621 | 813,32,1 622 | 844,6,1 623 | 134,4,0 624 | 848,1,0 625 | 309,1,0 626 | 779,2,0 627 | 3,31,1 628 | 850,31,0 629 | 610,1,0 630 | 848,11,1 631 | 850,27,0 632 | 851,25,1 633 | 852,0,0 634 | 852,1,0 635 | 793,18,1 636 | 367,11,0 637 | 713,2,0 638 | 854,4,0 639 | 848,13,0 640 | 586,14,0 641 | 854,10,0 642 | 848,0,0 643 | 374,25,1 644 | 857,2,0 645 | 288,15,1 646 | 854,5,0 647 | 867,17,1 648 | 479,5,0 649 | 367,13,1 650 | 870,3,0 651 | 367,20,0 652 | 828,27,1 653 | 788,0,0 654 | 875,60,1 655 | 800,45,1 656 | 788,4,0 657 | 793,8,0 658 | 815,5,0 659 | 479,0,0 660 | 428,14,0 661 | 616,10,0 662 | 800,21,0 663 | 848,4,0 664 | 428,19,0 665 | 884,19,0 666 | 793,11,0 667 | 848,6,0 668 | 888,24,0 669 | 888,20,0 670 | 848,0,0 671 | 55,1,0 672 | 309,3,0 673 | 895,6,0 674 | 848,3,0 675 | 870,2,1 676 | 270,11,0 677 | 55,10,0 678 | 265,6,0 679 | 64,3,0 680 | 902,1,0 681 | 367,11,0 682 | 181,1,0 683 | 854,26,0 684 | 2,7,0 685 | 55,2,0 686 | 834,1,0 687 | 872,19,0 688 | 55,8,0 689 | 55,7,0 690 | 55,40,1 691 | 795,9,0 692 | 3,10,0 693 | 55,13,0 694 | 888,106,1 695 | 834,2,0 696 | 479,4,1 697 | 788,2,0 698 | 854,17,1 699 | 920,22,1 700 | 923,2,0 701 | 914,12,1 702 | 920,17,0 703 | 659,26,1 704 | 920,8,1 705 | 683,25,0 706 | 793,9,0 707 | 795,26,0 708 | 428,9,0 709 | 872,42,0 710 | 850,0,0 711 | 374,12,0 712 | 850,6,0 713 | 367,21,0 714 | 860,14,0 715 | 940,5,0 716 | 951,0,0 717 | 949,6,0 718 | 956,0,0 719 | 956,1,0 720 | 940,2,0 721 | 962,10,1 722 | 713,4,0 723 | 851,8,0 724 | 928,7,0 725 | 55,5,0 726 | 55,9,0 727 | 967,1,0 728 | 860,6,0 729 | 872,18,1 730 | 870,6,0 731 | 506,10,0 732 
| 854,14,1 733 | 659,6,1 734 | 2,10,0 735 | 428,11,0 736 | 467,13,0 737 | 467,4,1 738 | 467,19,0 739 | 967,3,0 740 | 860,54,1 741 | 870,2,0 742 | 918,7,0 743 | 918,8,1 744 | 55,21,1 745 | 983,4,1 746 | 928,11,0 747 | 848,10,1 748 | 55,12,0 749 | 1025,2,1 750 | 1027,54,0 751 | 390,4,0 752 | 929,1,0 753 | 1029,23,0 754 | 860,119,1 755 | 37,16,1 756 | 860,79,1 757 | 2,45,1 758 | 971,10,0 759 | 795,42,1 760 | 793,5,0 761 | 396,0,0 762 | 857,53,1 763 | 932,1,0 764 | 932,2,0 765 | 497,15,1 766 | 954,11,1 767 | 920,14,1 768 | 932,1,0 769 | 952,2,0 770 | 952,3,0 771 | 1056,4,0 772 | 1058,6,0 773 | 1032,40,0 774 | 367,5,0 775 | 860,9,0 776 | 181,6,0 777 | 1032,18,1 778 | 181,4,0 779 | 1069,16,0 780 | 1069,2,0 781 | 586,21,0 782 | 592,3,0 783 | 1051,7,0 784 | 1069,38,0 785 | 481,51,1 786 | 592,4,0 787 | 1077,19,1 788 | 872,119,1 789 | 954,14,0 790 | 396,11,0 791 | 591,8,0 792 | 1091,2,0 793 | 604,11,0 794 | 604,13,0 795 | 375,5,0 796 | 1098,0,0 797 | 1098,0,0 798 | 1098,0,0 799 | 396,13,0 800 | 744,0,0 801 | 744,0,0 802 | 1101,0,0 803 | 744,0,0 804 | 848,5,1 805 | 1102,0,0 806 | 1104,0,0 807 | 1104,7,0 808 | 396,1,0 809 | 396,5,0 810 | 396,3,0 811 | 55,15,0 812 | 1097,1,0 813 | 1111,1,0 814 | 1114,1,0 815 | 972,2,0 816 | 1111,11,1 817 | 860,165,1 818 | 691,2,0 819 | 134,21,0 820 | 1129,34,0 821 | 872,13,0 822 | 3,49,0 823 | 1134,1,0 824 | 954,4,0 825 | 791,0,0 826 | 791,15,0 827 | 1111,4,0 828 | 932,2,0 829 | 55,4,0 830 | 940,13,0 831 | 592,32,0 832 | 1144,2,0 833 | 481,39,1 834 | 360,11,0 835 | 270,14,1 836 | 360,9,0 837 | 360,2,0 838 | 1151,4,0 839 | 713,9,0 840 | 360,3,0 841 | 55,5,0 842 | 954,2,0 843 | 1151,1,0 844 | 1162,1,0 845 | 954,0,0 846 | 713,18,0 847 | 55,11,0 848 | 428,30,1 849 | 713,16,0 850 | 860,15,0 851 | 55,4,0 852 | 860,30,0 853 | 1111,16,0 854 | 506,8,0 855 | 3,23,1 856 | 872,46,1 857 | 1176,16,0 858 | 848,4,0 859 | 860,49,0 860 | 1131,27,1 861 | 1047,2,0 862 | 1168,1,0 863 | 55,6,0 864 | 1045,40,0 865 | 360,3,0 866 | 360,3,0 867 | 428,3,0 868 | 535,3,0 
869 | 854,14,0 870 | 428,13,0 871 | 1111,0,0 872 | 1111,7,0 873 | 635,36,0 874 | 788,8,0 875 | 1045,10,0 876 | 1045,7,0 877 | 1199,1,0 878 | 55,1,0 879 | 1199,5,0 880 | 428,26,0 881 | 181,22,0 882 | 661,14,1 883 | 481,46,1 884 | 467,20,0 885 | 1032,10,0 886 | 1212,11,0 887 | 1212,8,0 888 | 711,5,0 889 | 1212,9,0 890 | 134,27,0 891 | 928,4,0 892 | 1218,23,1 893 | 447,6,0 894 | 610,14,0 895 | 428,10,0 896 | 1225,41,0 897 | 1225,8,1 898 | 928,45,1 899 | 1225,23,1 900 | 1154,3,0 901 | 1154,11,1 902 | 1122,2,0 903 | 203,11,1 904 | 428,21,1 905 | 1122,8,0 906 | 661,14,0 907 | 872,19,1 908 | 1243,0,0 909 | 1245,6,0 910 | 661,9,0 911 | 793,27,0 912 | 428,10,0 913 | 1244,9,0 914 | 1247,2,0 915 | 1252,9,1 916 | 1058,19,0 917 | 1257,6,0 918 | 1260,24,0 919 | 428,39,0 920 | 107,61,0 921 | 1,9,0 922 | 1266,0,0 923 | 1154,8,0 924 | 920,72,0 925 | 1188,8,0 926 | 1262,3,0 927 | 1270,2,0 928 | 1144,19,0 929 | 895,3,0 930 | 367,28,0 931 | 1097,0,0 932 | 1274,7,0 933 | 1226,2,0 934 | 793,19,0 935 | 928,7,0 936 | 1154,3,0 937 | 1297,5,0 938 | 1297,2,0 939 | 1290,24,1 940 | 309,24,1 941 | 1297,6,0 942 | 546,34,0 943 | 1270,0,0 944 | 1299,19,1 945 | 375,22,0 946 | 1266,120,0 947 | 691,2,0 948 | 1301,19,1 949 | 1238,5,0 950 | 1176,1,0 951 | 1,13,0 952 | 1305,4,0 953 | 1,43,0 954 | 1151,1,0 955 | 725,3,0 956 | 844,0,0 957 | 55,15,0 958 | 1151,0,0 959 | 569,4,0 960 | 1288,4,0 961 | 1,22,1 962 | 1314,5,0 963 | 571,2,0 964 | 1064,1,0 965 | 428,0,0 966 | 467,7,0 967 | 428,1,0 968 | 872,9,0 969 | 854,62,1 970 | 1219,2,0 971 | 134,6,1 972 | 802,0,0 973 | 928,19,0 974 | 467,9,1 975 | 1219,14,0 976 | 1219,13,0 977 | 571,10,0 978 | 1226,2,0 979 | 214,0,0 980 | 791,2,0 981 | 1326,0,0 982 | 1178,16,0 983 | 1314,1,0 984 | 1111,81,0 985 | 1113,4,0 986 | 920,65,0 987 | 1212,3,0 988 | 788,11,0 989 | 181,16,0 990 | 860,48,0 991 | 895,7,0 992 | 1329,33,0 993 | 1329,20,0 994 | 1329,31,0 995 | 1329,7,0 996 | 1329,6,0 997 | 1329,4,0 998 | 928,64,1 999 | 1352,8,0 1000 | 1176,5,0 1001 | 
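The CSV above has no header row. A minimal sketch of how its rows might be read, assuming the column order is user_id, num_likes, liked (the first two names come from thingiverse_tree.dot; the order and the name `liked` are assumptions). It also checks that the `error` values in that tree file look consistent with Gini impurity, using class counts copied from the tree's first two nodes. The helper names `parse_rows`, `class_counts` and `gini` are hypothetical, and the sketch is written for Python 3, unlike the Python 2 scripts in this repository.

```python
import csv
import io


def parse_rows(text):
    """Parse header-less CSV text into (user_id, num_likes, liked) int tuples.

    The column order is an assumption; the file itself does not name its columns.
    """
    return [tuple(int(v) for v in rec) for rec in csv.reader(io.StringIO(text)) if rec]


def class_counts(rows, n_classes=2):
    """Count how many rows carry each label (the assumed third column)."""
    counts = [0] * n_classes
    for _user_id, _num_likes, liked in rows:
        counts[liked] += 1
    return counts


def gini(counts):
    """Gini impurity of a node, given its per-class sample counts."""
    total = float(sum(counts))
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


# First five rows of thingiverse_liked_objects_1k.csv, copied verbatim.
SAMPLE = "3,28,1\n2,9,0\n2,5,0\n2,10,1\n9,14,1\n"

rows = parse_rows(SAMPLE)
print(class_counts(rows))            # label counts for the five sample rows
print(round(gini([780, 220]), 4))    # counts from tree node 0
print(round(gini([573, 56]), 6))     # counts from tree node 1
```

If the assumption holds, the last two printed values should agree with the `error` fields of nodes 0 and 1 in the tree file.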
-------------------------------------------------------------------------------- /solving_problems/thingiverse_tree.dot: -------------------------------------------------------------------------------- 1 | digraph Tree { 2 | 0 [label="num_likes <= 8.5\nerror = 0.3432\nsamples = 1000\nvalue = [ 780. 220.]"] ; 3 | 1 [label="num_likes <= 3.5\nerror = 0.162208\nsamples = 629\nvalue = [ 573. 56.]"] ; 4 | 122 [label="user_id <= 341.0\nerror = 0.493283\nsamples = 371\nvalue = [ 207. 164.]"] ; 5 | 0 -> 1 ; 6 | 0 -> 122 ; 7 | 1 [label="num_likes <= 3.5\nerror = 0.162208\nsamples = 629\nvalue = [ 573. 56.]"] ; 8 | 2 [label="num_likes <= 1.5\nerror = 0.0575034\nsamples = 405\nvalue = [ 393. 12.]"] ; 9 | 43 [label="user_id <= 924.0\nerror = 0.315689\nsamples = 224\nvalue = [ 180. 44.]"] ; 10 | 1 -> 2 ; 11 | 1 -> 43 ; 12 | 2 [label="num_likes <= 1.5\nerror = 0.0575034\nsamples = 405\nvalue = [ 393. 12.]"] ; 13 | 3 [label="num_likes <= 0.5\nerror = 0.0227243\nsamples = 261\nvalue = [ 258. 3.]"] ; 14 | 18 [label="user_id <= 254.5\nerror = 0.117188\nsamples = 144\nvalue = [ 135. 9.]"] ; 15 | 2 -> 3 ; 16 | 2 -> 18 ; 17 | 3 [label="num_likes <= 0.5\nerror = 0.0227243\nsamples = 261\nvalue = [ 258. 3.]"] ; 18 | 4 [label="error = 0.0\nsamples = 155\nvalue = [ 155. 0.]"] ; 19 | 5 [label="user_id <= 71.0\nerror = 0.0550018\nsamples = 106\nvalue = [ 103. 3.]"] ; 20 | 3 -> 4 ; 21 | 3 -> 5 ; 22 | 5 [label="user_id <= 71.0\nerror = 0.0550018\nsamples = 106\nvalue = [ 103. 3.]"] ; 23 | 6 [label="error = 0.0\nsamples = 27\nvalue = [ 27. 0.]"] ; 24 | 7 [label="user_id <= 79.0\nerror = 0.0730652\nsamples = 79\nvalue = [ 76. 3.]"] ; 25 | 5 -> 6 ; 26 | 5 -> 7 ; 27 | 7 [label="user_id <= 79.0\nerror = 0.0730652\nsamples = 79\nvalue = [ 76. 3.]"] ; 28 | 8 [label="error = 0.32\nsamples = 5\nvalue = [ 4. 1.]"] ; 29 | 9 [label="user_id <= 399.0\nerror = 0.0525931\nsamples = 74\nvalue = [ 72. 
2.]"] ; 30 | 7 -> 8 ; 31 | 7 -> 9 ; 32 | 9 [label="user_id <= 399.0\nerror = 0.0525931\nsamples = 74\nvalue = [ 72. 2.]"] ; 33 | 10 [label="error = 0.0\nsamples = 29\nvalue = [ 29. 0.]"] ; 34 | 11 [label="user_id <= 404.5\nerror = 0.0849383\nsamples = 45\nvalue = [ 43. 2.]"] ; 35 | 9 -> 10 ; 36 | 9 -> 11 ; 37 | 11 [label="user_id <= 404.5\nerror = 0.0849383\nsamples = 45\nvalue = [ 43. 2.]"] ; 38 | 12 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 39 | 13 [label="user_id <= 798.5\nerror = 0.0444215\nsamples = 44\nvalue = [ 43. 1.]"] ; 40 | 11 -> 12 ; 41 | 11 -> 13 ; 42 | 13 [label="user_id <= 798.5\nerror = 0.0444215\nsamples = 44\nvalue = [ 43. 1.]"] ; 43 | 14 [label="error = 0.0\nsamples = 22\nvalue = [ 22. 0.]"] ; 44 | 15 [label="user_id <= 831.0\nerror = 0.0867769\nsamples = 22\nvalue = [ 21. 1.]"] ; 45 | 13 -> 14 ; 46 | 13 -> 15 ; 47 | 15 [label="user_id <= 831.0\nerror = 0.0867769\nsamples = 22\nvalue = [ 21. 1.]"] ; 48 | 16 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 49 | 17 [label="error = 0.0\nsamples = 21\nvalue = [ 21. 0.]"] ; 50 | 15 -> 16 ; 51 | 15 -> 17 ; 52 | 18 [label="user_id <= 254.5\nerror = 0.117188\nsamples = 144\nvalue = [ 135. 9.]"] ; 53 | 19 [label="user_id <= 241.5\nerror = 0.19438\nsamples = 55\nvalue = [ 49. 6.]"] ; 54 | 30 [label="user_id <= 863.5\nerror = 0.0651433\nsamples = 89\nvalue = [ 86. 3.]"] ; 55 | 18 -> 19 ; 56 | 18 -> 30 ; 57 | 19 [label="user_id <= 241.5\nerror = 0.19438\nsamples = 55\nvalue = [ 49. 6.]"] ; 58 | 20 [label="user_id <= 46.0\nerror = 0.168038\nsamples = 54\nvalue = [ 49. 5.]"] ; 59 | 29 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 60 | 19 -> 20 ; 61 | 19 -> 29 ; 62 | 20 [label="user_id <= 46.0\nerror = 0.168038\nsamples = 54\nvalue = [ 49. 5.]"] ; 63 | 21 [label="error = 0.0\nsamples = 21\nvalue = [ 21. 0.]"] ; 64 | 22 [label="user_id <= 79.0\nerror = 0.257117\nsamples = 33\nvalue = [ 28. 
5.]"] ; 65 | 20 -> 21 ; 66 | 20 -> 22 ; 67 | 22 [label="user_id <= 79.0\nerror = 0.257117\nsamples = 33\nvalue = [ 28. 5.]"] ; 68 | 23 [label="num_likes <= 2.5\nerror = 0.396694\nsamples = 11\nvalue = [ 8. 3.]"] ; 69 | 26 [label="user_id <= 209.5\nerror = 0.165289\nsamples = 22\nvalue = [ 20. 2.]"] ; 70 | 22 -> 23 ; 71 | 22 -> 26 ; 72 | 23 [label="num_likes <= 2.5\nerror = 0.396694\nsamples = 11\nvalue = [ 8. 3.]"] ; 73 | 24 [label="error = 0.5\nsamples = 4\nvalue = [ 2. 2.]"] ; 74 | 25 [label="error = 0.244898\nsamples = 7\nvalue = [ 6. 1.]"] ; 75 | 23 -> 24 ; 76 | 23 -> 25 ; 77 | 26 [label="user_id <= 209.5\nerror = 0.165289\nsamples = 22\nvalue = [ 20. 2.]"] ; 78 | 27 [label="error = 0.0\nsamples = 14\nvalue = [ 14. 0.]"] ; 79 | 28 [label="error = 0.375\nsamples = 8\nvalue = [ 6. 2.]"] ; 80 | 26 -> 27 ; 81 | 26 -> 28 ; 82 | 30 [label="user_id <= 863.5\nerror = 0.0651433\nsamples = 89\nvalue = [ 86. 3.]"] ; 83 | 31 [label="user_id <= 480.0\nerror = 0.0327778\nsamples = 60\nvalue = [ 59. 1.]"] ; 84 | 36 [label="user_id <= 882.5\nerror = 0.128419\nsamples = 29\nvalue = [ 27. 2.]"] ; 85 | 30 -> 31 ; 86 | 30 -> 36 ; 87 | 31 [label="user_id <= 480.0\nerror = 0.0327778\nsamples = 60\nvalue = [ 59. 1.]"] ; 88 | 32 [label="user_id <= 453.5\nerror = 0.0713306\nsamples = 27\nvalue = [ 26. 1.]"] ; 89 | 35 [label="error = 0.0\nsamples = 33\nvalue = [ 33. 0.]"] ; 90 | 31 -> 32 ; 91 | 31 -> 35 ; 92 | 32 [label="user_id <= 453.5\nerror = 0.0713306\nsamples = 27\nvalue = [ 26. 1.]"] ; 93 | 33 [label="error = 0.0\nsamples = 18\nvalue = [ 18. 0.]"] ; 94 | 34 [label="error = 0.197531\nsamples = 9\nvalue = [ 8. 1.]"] ; 95 | 32 -> 33 ; 96 | 32 -> 34 ; 97 | 36 [label="user_id <= 882.5\nerror = 0.128419\nsamples = 29\nvalue = [ 27. 2.]"] ; 98 | 37 [label="error = 0.444444\nsamples = 3\nvalue = [ 2. 1.]"] ; 99 | 38 [label="user_id <= 1036.0\nerror = 0.0739645\nsamples = 26\nvalue = [ 25. 
1.]"] ; 100 | 36 -> 37 ; 101 | 36 -> 38 ; 102 | 38 [label="user_id <= 1036.0\nerror = 0.0739645\nsamples = 26\nvalue = [ 25. 1.]"] ; 103 | 39 [label="user_id <= 998.5\nerror = 0.165289\nsamples = 11\nvalue = [ 10. 1.]"] ; 104 | 42 [label="error = 0.0\nsamples = 15\nvalue = [ 15. 0.]"] ; 105 | 38 -> 39 ; 106 | 38 -> 42 ; 107 | 39 [label="user_id <= 998.5\nerror = 0.165289\nsamples = 11\nvalue = [ 10. 1.]"] ; 108 | 40 [label="error = 0.0\nsamples = 10\nvalue = [ 10. 0.]"] ; 109 | 41 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 110 | 39 -> 40 ; 111 | 39 -> 41 ; 112 | 43 [label="user_id <= 924.0\nerror = 0.315689\nsamples = 224\nvalue = [ 180. 44.]"] ; 113 | 44 [label="user_id <= 206.0\nerror = 0.353668\nsamples = 183\nvalue = [ 141. 42.]"] ; 114 | 115 [label="num_likes <= 7.5\nerror = 0.0928019\nsamples = 41\nvalue = [ 39. 2.]"] ; 115 | 43 -> 44 ; 116 | 43 -> 115 ; 117 | 44 [label="user_id <= 206.0\nerror = 0.353668\nsamples = 183\nvalue = [ 141. 42.]"] ; 118 | 45 [label="num_likes <= 5.5\nerror = 0.231111\nsamples = 60\nvalue = [ 52. 8.]"] ; 119 | 62 [label="user_id <= 252.0\nerror = 0.400026\nsamples = 123\nvalue = [ 89. 34.]"] ; 120 | 44 -> 45 ; 121 | 44 -> 62 ; 122 | 45 [label="num_likes <= 5.5\nerror = 0.231111\nsamples = 60\nvalue = [ 52. 8.]"] ; 123 | 46 [label="user_id <= 41.5\nerror = 0.117188\nsamples = 32\nvalue = [ 30. 2.]"] ; 124 | 51 [label="user_id <= 2.5\nerror = 0.336735\nsamples = 28\nvalue = [ 22. 6.]"] ; 125 | 45 -> 46 ; 126 | 45 -> 51 ; 127 | 46 [label="user_id <= 41.5\nerror = 0.117188\nsamples = 32\nvalue = [ 30. 2.]"] ; 128 | 47 [label="num_likes <= 4.5\nerror = 0.297521\nsamples = 11\nvalue = [ 9. 2.]"] ; 129 | 50 [label="error = 0.0\nsamples = 21\nvalue = [ 21. 0.]"] ; 130 | 46 -> 47 ; 131 | 46 -> 50 ; 132 | 47 [label="num_likes <= 4.5\nerror = 0.297521\nsamples = 11\nvalue = [ 9. 2.]"] ; 133 | 48 [label="error = 0.48\nsamples = 5\nvalue = [ 3. 2.]"] ; 134 | 49 [label="error = 0.0\nsamples = 6\nvalue = [ 6. 
0.]"] ; 135 | 47 -> 48 ; 136 | 47 -> 49 ; 137 | 51 [label="user_id <= 2.5\nerror = 0.336735\nsamples = 28\nvalue = [ 22. 6.]"] ; 138 | 52 [label="error = 0.0\nsamples = 7\nvalue = [ 7. 0.]"] ; 139 | 53 [label="user_id <= 157.5\nerror = 0.408163\nsamples = 21\nvalue = [ 15. 6.]"] ; 140 | 51 -> 52 ; 141 | 51 -> 53 ; 142 | 53 [label="user_id <= 157.5\nerror = 0.408163\nsamples = 21\nvalue = [ 15. 6.]"] ; 143 | 54 [label="user_id <= 108.0\nerror = 0.432133\nsamples = 19\nvalue = [ 13. 6.]"] ; 144 | 61 [label="error = 0.0\nsamples = 2\nvalue = [ 2. 0.]"] ; 145 | 53 -> 54 ; 146 | 53 -> 61 ; 147 | 54 [label="user_id <= 108.0\nerror = 0.432133\nsamples = 19\nvalue = [ 13. 6.]"] ; 148 | 55 [label="user_id <= 79.0\nerror = 0.391111\nsamples = 15\nvalue = [ 11. 4.]"] ; 149 | 60 [label="error = 0.5\nsamples = 4\nvalue = [ 2. 2.]"] ; 150 | 54 -> 55 ; 151 | 54 -> 60 ; 152 | 55 [label="user_id <= 79.0\nerror = 0.391111\nsamples = 15\nvalue = [ 11. 4.]"] ; 153 | 56 [label="user_id <= 65.0\nerror = 0.426035\nsamples = 13\nvalue = [ 9. 4.]"] ; 154 | 59 [label="error = 0.0\nsamples = 2\nvalue = [ 2. 0.]"] ; 155 | 55 -> 56 ; 156 | 55 -> 59 ; 157 | 56 [label="user_id <= 65.0\nerror = 0.426035\nsamples = 13\nvalue = [ 9. 4.]"] ; 158 | 57 [label="error = 0.345679\nsamples = 9\nvalue = [ 7. 2.]"] ; 159 | 58 [label="error = 0.5\nsamples = 4\nvalue = [ 2. 2.]"] ; 160 | 56 -> 57 ; 161 | 56 -> 58 ; 162 | 62 [label="user_id <= 252.0\nerror = 0.400026\nsamples = 123\nvalue = [ 89. 34.]"] ; 163 | 63 [label="num_likes <= 7.5\nerror = 0.495868\nsamples = 11\nvalue = [ 5. 6.]"] ; 164 | 68 [label="user_id <= 919.0\nerror = 0.375\nsamples = 112\nvalue = [ 84. 28.]"] ; 165 | 62 -> 63 ; 166 | 62 -> 68 ; 167 | 63 [label="num_likes <= 7.5\nerror = 0.495868\nsamples = 11\nvalue = [ 5. 6.]"] ; 168 | 64 [label="num_likes <= 5.5\nerror = 0.48\nsamples = 10\nvalue = [ 4. 6.]"] ; 169 | 67 [label="error = 0.0\nsamples = 1\nvalue = [ 1. 
0.]"] ; 170 | 63 -> 64 ; 171 | 63 -> 67 ; 172 | 64 [label="num_likes <= 5.5\nerror = 0.48\nsamples = 10\nvalue = [ 4. 6.]"] ; 173 | 65 [label="error = 0.489796\nsamples = 7\nvalue = [ 4. 3.]"] ; 174 | 66 [label="error = 0.0\nsamples = 3\nvalue = [ 0. 3.]"] ; 175 | 64 -> 65 ; 176 | 64 -> 66 ; 177 | 68 [label="user_id <= 919.0\nerror = 0.375\nsamples = 112\nvalue = [ 84. 28.]"] ; 178 | 69 [label="user_id <= 401.0\nerror = 0.368152\nsamples = 111\nvalue = [ 84. 27.]"] ; 179 | 114 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 180 | 68 -> 69 ; 181 | 68 -> 114 ; 182 | 69 [label="user_id <= 401.0\nerror = 0.368152\nsamples = 111\nvalue = [ 84. 27.]"] ; 183 | 70 [label="user_id <= 370.5\nerror = 0.244898\nsamples = 21\nvalue = [ 18. 3.]"] ; 184 | 75 [label="user_id <= 445.0\nerror = 0.391111\nsamples = 90\nvalue = [ 66. 24.]"] ; 185 | 69 -> 70 ; 186 | 69 -> 75 ; 187 | 70 [label="user_id <= 370.5\nerror = 0.244898\nsamples = 21\nvalue = [ 18. 3.]"] ; 188 | 71 [label="num_likes <= 6.5\nerror = 0.375\nsamples = 12\nvalue = [ 9. 3.]"] ; 189 | 74 [label="error = 0.0\nsamples = 9\nvalue = [ 9. 0.]"] ; 190 | 70 -> 71 ; 191 | 70 -> 74 ; 192 | 71 [label="num_likes <= 6.5\nerror = 0.375\nsamples = 12\nvalue = [ 9. 3.]"] ; 193 | 72 [label="error = 0.444444\nsamples = 9\nvalue = [ 6. 3.]"] ; 194 | 73 [label="error = 0.0\nsamples = 3\nvalue = [ 3. 0.]"] ; 195 | 71 -> 72 ; 196 | 71 -> 73 ; 197 | 75 [label="user_id <= 445.0\nerror = 0.391111\nsamples = 90\nvalue = [ 66. 24.]"] ; 198 | 76 [label="error = 0.0\nsamples = 3\nvalue = [ 0. 3.]"] ; 199 | 77 [label="user_id <= 849.0\nerror = 0.366231\nsamples = 87\nvalue = [ 66. 21.]"] ; 200 | 75 -> 76 ; 201 | 75 -> 77 ; 202 | 77 [label="user_id <= 849.0\nerror = 0.366231\nsamples = 87\nvalue = [ 66. 21.]"] ; 203 | 78 [label="user_id <= 824.5\nerror = 0.384551\nsamples = 77\nvalue = [ 57. 20.]"] ; 204 | 111 [label="user_id <= 906.5\nerror = 0.18\nsamples = 10\nvalue = [ 9. 
1.]"] ; 205 | 77 -> 78 ; 206 | 77 -> 111 ; 207 | 78 [label="user_id <= 824.5\nerror = 0.384551\nsamples = 77\nvalue = [ 57. 20.]"] ; 208 | 79 [label="user_id <= 778.5\nerror = 0.352653\nsamples = 70\nvalue = [ 54. 16.]"] ; 209 | 110 [label="error = 0.489796\nsamples = 7\nvalue = [ 3. 4.]"] ; 210 | 78 -> 79 ; 211 | 78 -> 110 ; 212 | 79 [label="user_id <= 778.5\nerror = 0.352653\nsamples = 70\nvalue = [ 54. 16.]"] ; 213 | 80 [label="user_id <= 764.0\nerror = 0.391111\nsamples = 60\nvalue = [ 44. 16.]"] ; 214 | 109 [label="error = 0.0\nsamples = 10\nvalue = [ 10. 0.]"] ; 215 | 79 -> 80 ; 216 | 79 -> 109 ; 217 | 80 [label="user_id <= 764.0\nerror = 0.391111\nsamples = 60\nvalue = [ 44. 16.]"] ; 218 | 81 [label="user_id <= 657.5\nerror = 0.379201\nsamples = 59\nvalue = [ 44. 15.]"] ; 219 | 108 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 220 | 80 -> 81 ; 221 | 80 -> 108 ; 222 | 81 [label="user_id <= 657.5\nerror = 0.379201\nsamples = 59\nvalue = [ 44. 15.]"] ; 223 | 82 [label="user_id <= 612.5\nerror = 0.353299\nsamples = 48\nvalue = [ 37. 11.]"] ; 224 | 105 [label="user_id <= 692.5\nerror = 0.46281\nsamples = 11\nvalue = [ 7. 4.]"] ; 225 | 81 -> 82 ; 226 | 81 -> 105 ; 227 | 82 [label="user_id <= 612.5\nerror = 0.353299\nsamples = 48\nvalue = [ 37. 11.]"] ; 228 | 83 [label="user_id <= 599.5\nerror = 0.369383\nsamples = 45\nvalue = [ 34. 11.]"] ; 229 | 104 [label="error = 0.0\nsamples = 3\nvalue = [ 3. 0.]"] ; 230 | 82 -> 83 ; 231 | 82 -> 104 ; 232 | 83 [label="user_id <= 599.5\nerror = 0.369383\nsamples = 45\nvalue = [ 34. 11.]"] ; 233 | 84 [label="user_id <= 580.0\nerror = 0.35695\nsamples = 43\nvalue = [ 33. 10.]"] ; 234 | 103 [label="error = 0.5\nsamples = 2\nvalue = [ 1. 1.]"] ; 235 | 83 -> 84 ; 236 | 83 -> 103 ; 237 | 84 [label="user_id <= 580.0\nerror = 0.35695\nsamples = 43\nvalue = [ 33. 10.]"] ; 238 | 85 [label="user_id <= 521.5\nerror = 0.381328\nsamples = 39\nvalue = [ 29. 10.]"] ; 239 | 102 [label="error = 0.0\nsamples = 4\nvalue = [ 4. 
0.]"] ; 240 | 84 -> 85 ; 241 | 84 -> 102 ; 242 | 85 [label="user_id <= 521.5\nerror = 0.381328\nsamples = 39\nvalue = [ 29. 10.]"] ; 243 | 86 [label="num_likes <= 4.5\nerror = 0.328181\nsamples = 29\nvalue = [ 23. 6.]"] ; 244 | 99 [label="num_likes <= 4.5\nerror = 0.48\nsamples = 10\nvalue = [ 6. 4.]"] ; 245 | 85 -> 86 ; 246 | 85 -> 99 ; 247 | 86 [label="num_likes <= 4.5\nerror = 0.328181\nsamples = 29\nvalue = [ 23. 6.]"] ; 248 | 87 [label="error = 0.489796\nsamples = 7\nvalue = [ 4. 3.]"] ; 249 | 88 [label="num_likes <= 5.5\nerror = 0.235537\nsamples = 22\nvalue = [ 19. 3.]"] ; 250 | 86 -> 87 ; 251 | 86 -> 88 ; 252 | 88 [label="num_likes <= 5.5\nerror = 0.235537\nsamples = 22\nvalue = [ 19. 3.]"] ; 253 | 89 [label="error = 0.0\nsamples = 5\nvalue = [ 5. 0.]"] ; 254 | 90 [label="user_id <= 494.0\nerror = 0.290657\nsamples = 17\nvalue = [ 14. 3.]"] ; 255 | 88 -> 89 ; 256 | 88 -> 90 ; 257 | 90 [label="user_id <= 494.0\nerror = 0.290657\nsamples = 17\nvalue = [ 14. 3.]"] ; 258 | 91 [label="user_id <= 485.0\nerror = 0.336735\nsamples = 14\nvalue = [ 11. 3.]"] ; 259 | 98 [label="error = 0.0\nsamples = 3\nvalue = [ 3. 0.]"] ; 260 | 90 -> 91 ; 261 | 90 -> 98 ; 262 | 91 [label="user_id <= 485.0\nerror = 0.336735\nsamples = 14\nvalue = [ 11. 3.]"] ; 263 | 92 [label="num_likes <= 7.5\nerror = 0.260355\nsamples = 13\nvalue = [ 11. 2.]"] ; 264 | 97 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 265 | 91 -> 92 ; 266 | 91 -> 97 ; 267 | 92 [label="num_likes <= 7.5\nerror = 0.260355\nsamples = 13\nvalue = [ 11. 2.]"] ; 268 | 93 [label="num_likes <= 6.5\nerror = 0.18\nsamples = 10\nvalue = [ 9. 1.]"] ; 269 | 96 [label="error = 0.444444\nsamples = 3\nvalue = [ 2. 1.]"] ; 270 | 92 -> 93 ; 271 | 92 -> 96 ; 272 | 93 [label="num_likes <= 6.5\nerror = 0.18\nsamples = 10\nvalue = [ 9. 1.]"] ; 273 | 94 [label="error = 0.277778\nsamples = 6\nvalue = [ 5. 1.]"] ; 274 | 95 [label="error = 0.0\nsamples = 4\nvalue = [ 4. 
0.]"] ; 275 | 93 -> 94 ; 276 | 93 -> 95 ; 277 | 99 [label="num_likes <= 4.5\nerror = 0.48\nsamples = 10\nvalue = [ 6. 4.]"] ; 278 | 100 [label="error = 0.277778\nsamples = 6\nvalue = [ 5. 1.]"] ; 279 | 101 [label="error = 0.375\nsamples = 4\nvalue = [ 1. 3.]"] ; 280 | 99 -> 100 ; 281 | 99 -> 101 ; 282 | 105 [label="user_id <= 692.5\nerror = 0.46281\nsamples = 11\nvalue = [ 7. 4.]"] ; 283 | 106 [label="error = 0.0\nsamples = 3\nvalue = [ 0. 3.]"] ; 284 | 107 [label="error = 0.21875\nsamples = 8\nvalue = [ 7. 1.]"] ; 285 | 105 -> 106 ; 286 | 105 -> 107 ; 287 | 111 [label="user_id <= 906.5\nerror = 0.18\nsamples = 10\nvalue = [ 9. 1.]"] ; 288 | 112 [label="error = 0.0\nsamples = 8\nvalue = [ 8. 0.]"] ; 289 | 113 [label="error = 0.5\nsamples = 2\nvalue = [ 1. 1.]"] ; 290 | 111 -> 112 ; 291 | 111 -> 113 ; 292 | 115 [label="num_likes <= 7.5\nerror = 0.0928019\nsamples = 41\nvalue = [ 39. 2.]"] ; 293 | 116 [label="num_likes <= 4.5\nerror = 0.0555102\nsamples = 35\nvalue = [ 34. 1.]"] ; 294 | 121 [label="error = 0.277778\nsamples = 6\nvalue = [ 5. 1.]"] ; 295 | 115 -> 116 ; 296 | 115 -> 121 ; 297 | 116 [label="num_likes <= 4.5\nerror = 0.0555102\nsamples = 35\nvalue = [ 34. 1.]"] ; 298 | 117 [label="user_id <= 1019.5\nerror = 0.165289\nsamples = 11\nvalue = [ 10. 1.]"] ; 299 | 120 [label="error = 0.0\nsamples = 24\nvalue = [ 24. 0.]"] ; 300 | 116 -> 117 ; 301 | 116 -> 120 ; 302 | 117 [label="user_id <= 1019.5\nerror = 0.165289\nsamples = 11\nvalue = [ 10. 1.]"] ; 303 | 118 [label="error = 0.375\nsamples = 4\nvalue = [ 3. 1.]"] ; 304 | 119 [label="error = 0.0\nsamples = 7\nvalue = [ 7. 0.]"] ; 305 | 117 -> 118 ; 306 | 117 -> 119 ; 307 | 122 [label="user_id <= 341.0\nerror = 0.493283\nsamples = 371\nvalue = [ 207. 164.]"] ; 308 | 123 [label="user_id <= 2.5\nerror = 0.484877\nsamples = 138\nvalue = [ 57. 81.]"] ; 309 | 166 [label="num_likes <= 41.5\nerror = 0.458656\nsamples = 233\nvalue = [ 150. 
83.]"] ; 310 | 122 -> 123 ; 311 | 122 -> 166 ; 312 | 123 [label="user_id <= 2.5\nerror = 0.484877\nsamples = 138\nvalue = [ 57. 81.]"] ; 313 | 124 [label="num_likes <= 43.5\nerror = 0.437045\nsamples = 31\nvalue = [ 21. 10.]"] ; 314 | 137 [label="user_id <= 241.0\nerror = 0.446502\nsamples = 107\nvalue = [ 36. 71.]"] ; 315 | 123 -> 124 ; 316 | 123 -> 137 ; 317 | 124 [label="num_likes <= 43.5\nerror = 0.437045\nsamples = 31\nvalue = [ 21. 10.]"] ; 318 | 125 [label="num_likes <= 11.5\nerror = 0.399524\nsamples = 29\nvalue = [ 21. 8.]"] ; 319 | 136 [label="error = 0.0\nsamples = 2\nvalue = [ 0. 2.]"] ; 320 | 124 -> 125 ; 321 | 124 -> 136 ; 322 | 125 [label="num_likes <= 11.5\nerror = 0.399524\nsamples = 29\nvalue = [ 21. 8.]"] ; 323 | 126 [label="num_likes <= 10.5\nerror = 0.277778\nsamples = 12\nvalue = [ 10. 2.]"] ; 324 | 129 [label="num_likes <= 12.5\nerror = 0.456747\nsamples = 17\nvalue = [ 11. 6.]"] ; 325 | 125 -> 126 ; 326 | 125 -> 129 ; 327 | 126 [label="num_likes <= 10.5\nerror = 0.277778\nsamples = 12\nvalue = [ 10. 2.]"] ; 328 | 127 [label="error = 0.345679\nsamples = 9\nvalue = [ 7. 2.]"] ; 329 | 128 [label="error = 0.0\nsamples = 3\nvalue = [ 3. 0.]"] ; 330 | 126 -> 127 ; 331 | 126 -> 128 ; 332 | 129 [label="num_likes <= 12.5\nerror = 0.456747\nsamples = 17\nvalue = [ 11. 6.]"] ; 333 | 130 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 334 | 131 [label="num_likes <= 13.5\nerror = 0.429688\nsamples = 16\nvalue = [ 11. 5.]"] ; 335 | 129 -> 130 ; 336 | 129 -> 131 ; 337 | 131 [label="num_likes <= 13.5\nerror = 0.429688\nsamples = 16\nvalue = [ 11. 5.]"] ; 338 | 132 [label="error = 0.0\nsamples = 2\nvalue = [ 2. 0.]"] ; 339 | 133 [label="num_likes <= 25.0\nerror = 0.459184\nsamples = 14\nvalue = [ 9. 5.]"] ; 340 | 131 -> 132 ; 341 | 131 -> 133 ; 342 | 133 [label="num_likes <= 25.0\nerror = 0.459184\nsamples = 14\nvalue = [ 9. 5.]"] ; 343 | 134 [label="error = 0.5\nsamples = 6\nvalue = [ 3. 
3.]"] ; 344 | 135 [label="error = 0.375\nsamples = 8\nvalue = [ 6. 2.]"] ; 345 | 133 -> 134 ; 346 | 133 -> 135 ; 347 | 137 [label="user_id <= 241.0\nerror = 0.446502\nsamples = 107\nvalue = [ 36. 71.]"] ; 348 | 138 [label="user_id <= 50.0\nerror = 0.465974\nsamples = 92\nvalue = [ 34. 58.]"] ; 349 | 163 [label="num_likes <= 16.5\nerror = 0.231111\nsamples = 15\nvalue = [ 2. 13.]"] ; 350 | 137 -> 138 ; 351 | 137 -> 163 ; 352 | 138 [label="user_id <= 50.0\nerror = 0.465974\nsamples = 92\nvalue = [ 34. 58.]"] ; 353 | 139 [label="num_likes <= 28.5\nerror = 0.310651\nsamples = 26\nvalue = [ 5. 21.]"] ; 354 | 144 [label="num_likes <= 28.5\nerror = 0.492654\nsamples = 66\nvalue = [ 29. 37.]"] ; 355 | 138 -> 139 ; 356 | 138 -> 144 ; 357 | 139 [label="num_likes <= 28.5\nerror = 0.310651\nsamples = 26\nvalue = [ 5. 21.]"] ; 358 | 140 [label="num_likes <= 12.5\nerror = 0.197531\nsamples = 18\nvalue = [ 2. 16.]"] ; 359 | 143 [label="error = 0.46875\nsamples = 8\nvalue = [ 3. 5.]"] ; 360 | 139 -> 140 ; 361 | 139 -> 143 ; 362 | 140 [label="num_likes <= 12.5\nerror = 0.197531\nsamples = 18\nvalue = [ 2. 16.]"] ; 363 | 141 [label="error = 0.375\nsamples = 8\nvalue = [ 2. 6.]"] ; 364 | 142 [label="error = 0.0\nsamples = 10\nvalue = [ 0. 10.]"] ; 365 | 140 -> 141 ; 366 | 140 -> 142 ; 367 | 144 [label="num_likes <= 28.5\nerror = 0.492654\nsamples = 66\nvalue = [ 29. 37.]"] ; 368 | 145 [label="num_likes <= 22.5\nerror = 0.49926\nsamples = 52\nvalue = [ 27. 25.]"] ; 369 | 160 [label="num_likes <= 52.0\nerror = 0.244898\nsamples = 14\nvalue = [ 2. 12.]"] ; 370 | 144 -> 145 ; 371 | 144 -> 160 ; 372 | 145 [label="num_likes <= 22.5\nerror = 0.49926\nsamples = 52\nvalue = [ 27. 25.]"] ; 373 | 146 [label="user_id <= 57.0\nerror = 0.496219\nsamples = 46\nvalue = [ 21. 25.]"] ; 374 | 159 [label="error = 0.0\nsamples = 6\nvalue = [ 6. 0.]"] ; 375 | 145 -> 146 ; 376 | 145 -> 159 ; 377 | 146 [label="user_id <= 57.0\nerror = 0.496219\nsamples = 46\nvalue = [ 21. 
25.]"] ; 378 | 147 [label="num_likes <= 18.0\nerror = 0.42\nsamples = 10\nvalue = [ 7. 3.]"] ; 379 | 150 [label="user_id <= 121.5\nerror = 0.475309\nsamples = 36\nvalue = [ 14. 22.]"] ; 380 | 146 -> 147 ; 381 | 146 -> 150 ; 382 | 147 [label="num_likes <= 18.0\nerror = 0.42\nsamples = 10\nvalue = [ 7. 3.]"] ; 383 | 148 [label="error = 0.345679\nsamples = 9\nvalue = [ 7. 2.]"] ; 384 | 149 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 385 | 147 -> 148 ; 386 | 147 -> 149 ; 387 | 150 [label="user_id <= 121.5\nerror = 0.475309\nsamples = 36\nvalue = [ 14. 22.]"] ; 388 | 151 [label="num_likes <= 10.5\nerror = 0.290657\nsamples = 17\nvalue = [ 3. 14.]"] ; 389 | 154 [label="user_id <= 192.0\nerror = 0.487535\nsamples = 19\nvalue = [ 11. 8.]"] ; 390 | 150 -> 151 ; 391 | 150 -> 154 ; 392 | 151 [label="num_likes <= 10.5\nerror = 0.290657\nsamples = 17\nvalue = [ 3. 14.]"] ; 393 | 152 [label="error = 0.5\nsamples = 6\nvalue = [ 3. 3.]"] ; 394 | 153 [label="error = 0.0\nsamples = 11\nvalue = [ 0. 11.]"] ; 395 | 151 -> 152 ; 396 | 151 -> 153 ; 397 | 154 [label="user_id <= 192.0\nerror = 0.487535\nsamples = 19\nvalue = [ 11. 8.]"] ; 398 | 155 [label="user_id <= 175.0\nerror = 0.297521\nsamples = 11\nvalue = [ 9. 2.]"] ; 399 | 158 [label="error = 0.375\nsamples = 8\nvalue = [ 2. 6.]"] ; 400 | 154 -> 155 ; 401 | 154 -> 158 ; 402 | 155 [label="user_id <= 175.0\nerror = 0.297521\nsamples = 11\nvalue = [ 9. 2.]"] ; 403 | 156 [label="error = 0.444444\nsamples = 6\nvalue = [ 4. 2.]"] ; 404 | 157 [label="error = 0.0\nsamples = 5\nvalue = [ 5. 0.]"] ; 405 | 155 -> 156 ; 406 | 155 -> 157 ; 407 | 160 [label="num_likes <= 52.0\nerror = 0.244898\nsamples = 14\nvalue = [ 2. 12.]"] ; 408 | 161 [label="error = 0.0\nsamples = 10\nvalue = [ 0. 10.]"] ; 409 | 162 [label="error = 0.5\nsamples = 4\nvalue = [ 2. 2.]"] ; 410 | 160 -> 161 ; 411 | 160 -> 162 ; 412 | 163 [label="num_likes <= 16.5\nerror = 0.231111\nsamples = 15\nvalue = [ 2. 
13.]"] ; 413 | 164 [label="error = 0.345679\nsamples = 9\nvalue = [ 2. 7.]"] ; 414 | 165 [label="error = 0.0\nsamples = 6\nvalue = [ 0. 6.]"] ; 415 | 163 -> 164 ; 416 | 163 -> 165 ; 417 | 166 [label="num_likes <= 41.5\nerror = 0.458656\nsamples = 233\nvalue = [ 150. 83.]"] ; 418 | 167 [label="num_likes <= 16.5\nerror = 0.422762\nsamples = 201\nvalue = [ 140. 61.]"] ; 419 | 242 [label="user_id <= 977.5\nerror = 0.429688\nsamples = 32\nvalue = [ 10. 22.]"] ; 420 | 166 -> 167 ; 421 | 166 -> 242 ; 422 | 167 [label="num_likes <= 16.5\nerror = 0.422762\nsamples = 201\nvalue = [ 140. 61.]"] ; 423 | 168 [label="user_id <= 1248.0\nerror = 0.356505\nsamples = 112\nvalue = [ 86. 26.]"] ; 424 | 213 [label="user_id <= 480.0\nerror = 0.477212\nsamples = 89\nvalue = [ 54. 35.]"] ; 425 | 167 -> 168 ; 426 | 167 -> 213 ; 427 | 168 [label="user_id <= 1248.0\nerror = 0.356505\nsamples = 112\nvalue = [ 86. 26.]"] ; 428 | 169 [label="user_id <= 1165.0\nerror = 0.348998\nsamples = 111\nvalue = [ 86. 25.]"] ; 429 | 212 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 430 | 168 -> 169 ; 431 | 168 -> 212 ; 432 | 169 [label="user_id <= 1165.0\nerror = 0.348998\nsamples = 111\nvalue = [ 86. 25.]"] ; 433 | 170 [label="user_id <= 893.0\nerror = 0.3652\nsamples = 104\nvalue = [ 79. 25.]"] ; 434 | 211 [label="error = 0.0\nsamples = 7\nvalue = [ 7. 0.]"] ; 435 | 169 -> 170 ; 436 | 169 -> 211 ; 437 | 170 [label="user_id <= 893.0\nerror = 0.3652\nsamples = 104\nvalue = [ 79. 25.]"] ; 438 | 171 [label="user_id <= 370.5\nerror = 0.328181\nsamples = 87\nvalue = [ 69. 18.]"] ; 439 | 198 [label="user_id <= 924.0\nerror = 0.484429\nsamples = 17\nvalue = [ 10. 7.]"] ; 440 | 170 -> 171 ; 441 | 170 -> 198 ; 442 | 171 [label="user_id <= 370.5\nerror = 0.328181\nsamples = 87\nvalue = [ 69. 18.]"] ; 443 | 172 [label="user_id <= 363.5\nerror = 0.48\nsamples = 10\nvalue = [ 6. 4.]"] ; 444 | 175 [label="user_id <= 447.5\nerror = 0.297521\nsamples = 77\nvalue = [ 63. 
14.]"] ; 445 | 171 -> 172 ; 446 | 171 -> 175 ; 447 | 172 [label="user_id <= 363.5\nerror = 0.48\nsamples = 10\nvalue = [ 6. 4.]"] ; 448 | 173 [label="error = 0.0\nsamples = 2\nvalue = [ 2. 0.]"] ; 449 | 174 [label="error = 0.5\nsamples = 8\nvalue = [ 4. 4.]"] ; 450 | 172 -> 173 ; 451 | 172 -> 174 ; 452 | 175 [label="user_id <= 447.5\nerror = 0.297521\nsamples = 77\nvalue = [ 63. 14.]"] ; 453 | 176 [label="num_likes <= 12.5\nerror = 0.124444\nsamples = 15\nvalue = [ 14. 1.]"] ; 454 | 179 [label="user_id <= 542.5\nerror = 0.331426\nsamples = 62\nvalue = [ 49. 13.]"] ; 455 | 175 -> 176 ; 456 | 175 -> 179 ; 457 | 176 [label="num_likes <= 12.5\nerror = 0.124444\nsamples = 15\nvalue = [ 14. 1.]"] ; 458 | 177 [label="error = 0.0\nsamples = 10\nvalue = [ 10. 0.]"] ; 459 | 178 [label="error = 0.32\nsamples = 5\nvalue = [ 4. 1.]"] ; 460 | 176 -> 177 ; 461 | 176 -> 178 ; 462 | 179 [label="user_id <= 542.5\nerror = 0.331426\nsamples = 62\nvalue = [ 49. 13.]"] ; 463 | 180 [label="user_id <= 518.5\nerror = 0.444444\nsamples = 24\nvalue = [ 16. 8.]"] ; 464 | 189 [label="user_id <= 825.0\nerror = 0.228532\nsamples = 38\nvalue = [ 33. 5.]"] ; 465 | 179 -> 180 ; 466 | 179 -> 189 ; 467 | 180 [label="user_id <= 518.5\nerror = 0.444444\nsamples = 24\nvalue = [ 16. 8.]"] ; 468 | 181 [label="num_likes <= 12.5\nerror = 0.42344\nsamples = 23\nvalue = [ 16. 7.]"] ; 469 | 188 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 470 | 180 -> 181 ; 471 | 180 -> 188 ; 472 | 181 [label="num_likes <= 12.5\nerror = 0.42344\nsamples = 23\nvalue = [ 16. 7.]"] ; 473 | 182 [label="user_id <= 490.0\nerror = 0.48\nsamples = 15\nvalue = [ 9. 6.]"] ; 474 | 187 [label="error = 0.21875\nsamples = 8\nvalue = [ 7. 1.]"] ; 475 | 181 -> 182 ; 476 | 181 -> 187 ; 477 | 182 [label="user_id <= 490.0\nerror = 0.48\nsamples = 15\nvalue = [ 9. 6.]"] ; 478 | 183 [label="num_likes <= 10.5\nerror = 0.5\nsamples = 12\nvalue = [ 6. 6.]"] ; 479 | 186 [label="error = 0.0\nsamples = 3\nvalue = [ 3. 
0.]"] ; 480 | 182 -> 183 ; 481 | 182 -> 186 ; 482 | 183 [label="num_likes <= 10.5\nerror = 0.5\nsamples = 12\nvalue = [ 6. 6.]"] ; 483 | 184 [label="error = 0.444444\nsamples = 6\nvalue = [ 4. 2.]"] ; 484 | 185 [label="error = 0.444444\nsamples = 6\nvalue = [ 2. 4.]"] ; 485 | 183 -> 184 ; 486 | 183 -> 185 ; 487 | 189 [label="user_id <= 825.0\nerror = 0.228532\nsamples = 38\nvalue = [ 33. 5.]"] ; 488 | 190 [label="num_likes <= 15.5\nerror = 0.137174\nsamples = 27\nvalue = [ 25. 2.]"] ; 489 | 195 [label="user_id <= 857.0\nerror = 0.396694\nsamples = 11\nvalue = [ 8. 3.]"] ; 490 | 189 -> 190 ; 491 | 189 -> 195 ; 492 | 190 [label="num_likes <= 15.5\nerror = 0.137174\nsamples = 27\nvalue = [ 25. 2.]"] ; 493 | 191 [label="num_likes <= 13.5\nerror = 0.0768\nsamples = 25\nvalue = [ 24. 1.]"] ; 494 | 194 [label="error = 0.5\nsamples = 2\nvalue = [ 1. 1.]"] ; 495 | 190 -> 191 ; 496 | 190 -> 194 ; 497 | 191 [label="num_likes <= 13.5\nerror = 0.0768\nsamples = 25\nvalue = [ 24. 1.]"] ; 498 | 192 [label="error = 0.0\nsamples = 17\nvalue = [ 17. 0.]"] ; 499 | 193 [label="error = 0.21875\nsamples = 8\nvalue = [ 7. 1.]"] ; 500 | 191 -> 192 ; 501 | 191 -> 193 ; 502 | 195 [label="user_id <= 857.0\nerror = 0.396694\nsamples = 11\nvalue = [ 8. 3.]"] ; 503 | 196 [label="error = 0.5\nsamples = 6\nvalue = [ 3. 3.]"] ; 504 | 197 [label="error = 0.0\nsamples = 5\nvalue = [ 5. 0.]"] ; 505 | 195 -> 196 ; 506 | 195 -> 197 ; 507 | 198 [label="user_id <= 924.0\nerror = 0.484429\nsamples = 17\nvalue = [ 10. 7.]"] ; 508 | 199 [label="error = 0.0\nsamples = 2\nvalue = [ 0. 2.]"] ; 509 | 200 [label="user_id <= 1132.5\nerror = 0.444444\nsamples = 15\nvalue = [ 10. 5.]"] ; 510 | 198 -> 199 ; 511 | 198 -> 200 ; 512 | 200 [label="user_id <= 1132.5\nerror = 0.444444\nsamples = 15\nvalue = [ 10. 5.]"] ; 513 | 201 [label="num_likes <= 15.5\nerror = 0.408163\nsamples = 14\nvalue = [ 10. 4.]"] ; 514 | 210 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 
1.]"] ; 515 | 200 -> 201 ; 516 | 200 -> 210 ; 517 | 201 [label="num_likes <= 15.5\nerror = 0.408163\nsamples = 14\nvalue = [ 10. 4.]"] ; 518 | 202 [label="user_id <= 1078.0\nerror = 0.444444\nsamples = 12\nvalue = [ 8. 4.]"] ; 519 | 209 [label="error = 0.0\nsamples = 2\nvalue = [ 2. 0.]"] ; 520 | 201 -> 202 ; 521 | 201 -> 209 ; 522 | 202 [label="user_id <= 1078.0\nerror = 0.444444\nsamples = 12\nvalue = [ 8. 4.]"] ; 523 | 203 [label="num_likes <= 14.5\nerror = 0.396694\nsamples = 11\nvalue = [ 8. 3.]"] ; 524 | 208 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 525 | 202 -> 203 ; 526 | 202 -> 208 ; 527 | 203 [label="num_likes <= 14.5\nerror = 0.396694\nsamples = 11\nvalue = [ 8. 3.]"] ; 528 | 204 [label="user_id <= 947.0\nerror = 0.32\nsamples = 10\nvalue = [ 8. 2.]"] ; 529 | 207 [label="error = 0.0\nsamples = 1\nvalue = [ 0. 1.]"] ; 530 | 203 -> 204 ; 531 | 203 -> 207 ; 532 | 204 [label="user_id <= 947.0\nerror = 0.32\nsamples = 10\nvalue = [ 8. 2.]"] ; 533 | 205 [label="error = 0.0\nsamples = 4\nvalue = [ 4. 0.]"] ; 534 | 206 [label="error = 0.444444\nsamples = 6\nvalue = [ 4. 2.]"] ; 535 | 204 -> 205 ; 536 | 204 -> 206 ; 537 | 213 [label="user_id <= 480.0\nerror = 0.477212\nsamples = 89\nvalue = [ 54. 35.]"] ; 538 | 214 [label="num_likes <= 29.0\nerror = 0.35124\nsamples = 22\nvalue = [ 17. 5.]"] ; 539 | 221 [label="user_id <= 538.5\nerror = 0.494542\nsamples = 67\nvalue = [ 37. 30.]"] ; 540 | 213 -> 214 ; 541 | 213 -> 221 ; 542 | 214 [label="num_likes <= 29.0\nerror = 0.35124\nsamples = 22\nvalue = [ 17. 5.]"] ; 543 | 215 [label="num_likes <= 25.5\nerror = 0.277778\nsamples = 18\nvalue = [ 15. 3.]"] ; 544 | 220 [label="error = 0.5\nsamples = 4\nvalue = [ 2. 2.]"] ; 545 | 214 -> 215 ; 546 | 214 -> 220 ; 547 | 215 [label="num_likes <= 25.5\nerror = 0.277778\nsamples = 18\nvalue = [ 15. 3.]"] ; 548 | 216 [label="num_likes <= 20.5\nerror = 0.35503\nsamples = 13\nvalue = [ 10. 3.]"] ; 549 | 219 [label="error = 0.0\nsamples = 5\nvalue = [ 5. 
0.]"] ; 550 | 215 -> 216 ; 551 | 215 -> 219 ; 552 | 216 [label="num_likes <= 20.5\nerror = 0.35503\nsamples = 13\nvalue = [ 10. 3.]"] ; 553 | 217 [label="error = 0.21875\nsamples = 8\nvalue = [ 7. 1.]"] ; 554 | 218 [label="error = 0.48\nsamples = 5\nvalue = [ 3. 2.]"] ; 555 | 216 -> 217 ; 556 | 216 -> 218 ; 557 | 221 [label="user_id <= 538.5\nerror = 0.494542\nsamples = 67\nvalue = [ 37. 30.]"] ; 558 | 222 [label="error = 0.0\nsamples = 4\nvalue = [ 0. 4.]"] ; 559 | 223 [label="num_likes <= 33.5\nerror = 0.484757\nsamples = 63\nvalue = [ 37. 26.]"] ; 560 | 221 -> 222 ; 561 | 221 -> 223 ; 562 | 223 [label="num_likes <= 33.5\nerror = 0.484757\nsamples = 63\nvalue = [ 37. 26.]"] ; 563 | 224 [label="user_id <= 1315.0\nerror = 0.495868\nsamples = 55\nvalue = [ 30. 25.]"] ; 564 | 241 [label="error = 0.21875\nsamples = 8\nvalue = [ 7. 1.]"] ; 565 | 223 -> 224 ; 566 | 223 -> 241 ; 567 | 224 [label="user_id <= 1315.0\nerror = 0.495868\nsamples = 55\nvalue = [ 30. 25.]"] ; 568 | 225 [label="user_id <= 1067.5\nerror = 0.49926\nsamples = 52\nvalue = [ 27. 25.]"] ; 569 | 240 [label="error = 0.0\nsamples = 3\nvalue = [ 3. 0.]"] ; 570 | 224 -> 225 ; 571 | 224 -> 240 ; 572 | 225 [label="user_id <= 1067.5\nerror = 0.49926\nsamples = 52\nvalue = [ 27. 25.]"] ; 573 | 226 [label="user_id <= 878.0\nerror = 0.48675\nsamples = 43\nvalue = [ 25. 18.]"] ; 574 | 239 [label="error = 0.345679\nsamples = 9\nvalue = [ 2. 7.]"] ; 575 | 225 -> 226 ; 576 | 225 -> 239 ; 577 | 226 [label="user_id <= 878.0\nerror = 0.48675\nsamples = 43\nvalue = [ 25. 18.]"] ; 578 | 227 [label="user_id <= 800.5\nerror = 0.499541\nsamples = 33\nvalue = [ 17. 16.]"] ; 579 | 236 [label="user_id <= 1030.5\nerror = 0.32\nsamples = 10\nvalue = [ 8. 2.]"] ; 580 | 226 -> 227 ; 581 | 226 -> 236 ; 582 | 227 [label="user_id <= 800.5\nerror = 0.499541\nsamples = 33\nvalue = [ 17. 16.]"] ; 583 | 228 [label="user_id <= 707.0\nerror = 0.465374\nsamples = 19\nvalue = [ 12. 
7.]"] ; 584 | 233 [label="num_likes <= 25.5\nerror = 0.459184\nsamples = 14\nvalue = [ 5. 9.]"] ; 585 | 227 -> 228 ; 586 | 227 -> 233 ; 587 | 228 [label="user_id <= 707.0\nerror = 0.465374\nsamples = 19\nvalue = [ 12. 7.]"] ; 588 | 229 [label="num_likes <= 25.5\nerror = 0.5\nsamples = 12\nvalue = [ 6. 6.]"] ; 589 | 232 [label="error = 0.244898\nsamples = 7\nvalue = [ 6. 1.]"] ; 590 | 228 -> 229 ; 591 | 228 -> 232 ; 592 | 229 [label="num_likes <= 25.5\nerror = 0.5\nsamples = 12\nvalue = [ 6. 6.]"] ; 593 | 230 [label="error = 0.408163\nsamples = 7\nvalue = [ 5. 2.]"] ; 594 | 231 [label="error = 0.32\nsamples = 5\nvalue = [ 1. 4.]"] ; 595 | 229 -> 230 ; 596 | 229 -> 231 ; 597 | 233 [label="num_likes <= 25.5\nerror = 0.459184\nsamples = 14\nvalue = [ 5. 9.]"] ; 598 | 234 [label="error = 0.21875\nsamples = 8\nvalue = [ 1. 7.]"] ; 599 | 235 [label="error = 0.444444\nsamples = 6\nvalue = [ 4. 2.]"] ; 600 | 233 -> 234 ; 601 | 233 -> 235 ; 602 | 236 [label="user_id <= 1030.5\nerror = 0.32\nsamples = 10\nvalue = [ 8. 2.]"] ; 603 | 237 [label="error = 0.21875\nsamples = 8\nvalue = [ 7. 1.]"] ; 604 | 238 [label="error = 0.5\nsamples = 2\nvalue = [ 1. 1.]"] ; 605 | 236 -> 237 ; 606 | 236 -> 238 ; 607 | 242 [label="user_id <= 977.5\nerror = 0.429688\nsamples = 32\nvalue = [ 10. 22.]"] ; 608 | 243 [label="num_likes <= 50.5\nerror = 0.366231\nsamples = 29\nvalue = [ 7. 22.]"] ; 609 | 250 [label="error = 0.0\nsamples = 3\nvalue = [ 3. 0.]"] ; 610 | 242 -> 243 ; 611 | 242 -> 250 ; 612 | 243 [label="num_likes <= 50.5\nerror = 0.366231\nsamples = 29\nvalue = [ 7. 22.]"] ; 613 | 244 [label="num_likes <= 47.5\nerror = 0.486111\nsamples = 12\nvalue = [ 5. 7.]"] ; 614 | 247 [label="user_id <= 904.0\nerror = 0.207612\nsamples = 17\nvalue = [ 2. 15.]"] ; 615 | 243 -> 244 ; 616 | 243 -> 247 ; 617 | 244 [label="num_likes <= 47.5\nerror = 0.486111\nsamples = 12\nvalue = [ 5. 7.]"] ; 618 | 245 [label="error = 0.21875\nsamples = 8\nvalue = [ 1. 
7.]"] ; 619 | 246 [label="error = 0.0\nsamples = 4\nvalue = [ 4. 0.]"] ; 620 | 244 -> 245 ; 621 | 244 -> 246 ; 622 | 247 [label="user_id <= 904.0\nerror = 0.207612\nsamples = 17\nvalue = [ 2. 15.]"] ; 623 | 248 [label="error = 0.0\nsamples = 13\nvalue = [ 0. 13.]"] ; 624 | 249 [label="error = 0.5\nsamples = 4\nvalue = [ 2. 2.]"] ; 625 | 247 -> 248 ; 626 | 247 -> 249 ; 627 | } -------------------------------------------------------------------------------- /solving_problems/thingiverse_tree.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hmason/ml_class/33af094dffeb48ae6c4ce18cebb5162c69a2e35f/solving_problems/thingiverse_tree.png --------------------------------------------------------------------------------
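A note on reading the node labels in thingiverse_tree.dot: the `error` field appears to be the Gini impurity of the class counts listed in `value`. For example, `value = [ 22. 6.]` with `samples = 28` gives 1 - (22/28)^2 - (6/28)^2, which is about 0.336735 and matches the label. The sketch below checks this under that assumption. The helper names `gini` and `parse_label` are hypothetical and do not come from the repository, and the regular expression only targets the label layout seen in this file.

```python
import re

# Matches the tail of a node label as it appears in the DOT file, where "\n"
# is a literal backslash followed by "n" (not a real newline).
LABEL_RE = re.compile(r"error = ([\d.]+)\\nsamples = (\d+)\\nvalue = \[([^\]]*)\]")


def gini(counts):
    """Gini impurity of a list of class counts: 1 - sum(p_i ** 2)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def parse_label(label):
    """Return (error, samples, counts) from one node label string.

    Hypothetical helper: assumes the 'error = ...\\nsamples = ...\\nvalue = [...]'
    layout used in this file.
    """
    match = LABEL_RE.search(label)
    if match is None:
        raise ValueError("unrecognised label: %r" % label)
    error = float(match.group(1))
    samples = int(match.group(2))
    counts = [float(tok) for tok in match.group(3).split()]
    return error, samples, counts


if __name__ == "__main__":
    # Raw string so that \n stays as two characters, as in the DOT source.
    label = r"user_id <= 2.5\nerror = 0.336735\nsamples = 28\nvalue = [ 22. 6.]"
    error, samples, counts = parse_label(label)
    print(round(gini(counts), 6))  # 0.336735
    print(abs(gini(counts) - error) < 1e-5)  # True
```

Leaf nodes with `error = 0.0` are the pure ones (all samples in one class), and `error = 0.5` is the maximum for two classes (an even split), which is consistent with the Gini reading.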