User:Renepick/todos

/Mooc Scripte

This page is meant to be extended into a full structure and overview of the later parts of the Web Science MOOC:

taken from:

also interesting is our first version of part2:

Super structure
 * 1) Part 1 - HOW?: Web technology
 * Lead question: How does the Web work from a technical perspective?
 * 2) Part 2 - WHAT?: Web models (Units 1-4): Structure (Content and Nodes and Links) and Processes (Content Formation and Link Formation and Processes on Links)
 * Lead question: What structures constrain people to form the Web?
 * Dominating disciplines: mathematics and physics
 * Processes on links could be part of behavioral models???
 * 3) Part 3 - WHY?: Behavioral models (Units: Social capital, Decision making (independent vs dependent (herding)), Payment models (Advertisement)) and their exploitation
 * Lead question: Why do people behave the way they do when creating the Web?
 * Dominating disciplines: sociology, economics
 * Exploitation: crowdsourcing, personalization, ...
 * 4) Part 4 - Web & Society - Constraining the Web (copyright & privacy)
 * 5) Part 5 - Integrated view: search (Units: Search & Personalization)
 * Lead question: Why is search on the Web so successful?

=lesson|Modelling the Static Web: The Web as a Graph=
 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Understand the notion of a `model'
 * 2) Understand basic properties of graphs
 * 3) Understand the representation of Web link structure as a graph
 * 4) Understand how to represent a graph as adjacency lists or matrices

unit|Basic Graph properties

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Know its constituents: Nodes, Edges, Labels
 * 2) Know operations on constituents: Following edges, counting edges, etc.

unit|Modeling Web link structure as a graph

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Nodes = Web pages
 * 2) Edges = HTML links
 * 3) Labels of nodes = URL (most frequent) XOR HTML text XOR nothing
 * 4) Labels of edges = nothing XOR anchor text

unit|Matrix representation of Graphs

 * learningGoals=
 * 1) Every graph can be represented as a matrix.
 * 2) Be able to translate graphs to matrices and matrices to graphs
 * 3) How to implement basic graph operations (edge counting, edge following,...) using linear algebra
 * 4) Further reading: how singular value decomposition is a linear algebra operation that implements basic graph operations applied infinitely many times
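
The translation between a graph and its matrix, and the basic operations named in goal 3, can be sketched in a few lines. This is only a minimal illustration under assumptions: the three pages and their links are hypothetical toy data, and the helper `matmul` is written by hand so that no library is needed.

```python
# Hypothetical toy Web graph as adjacency lists: page -> pages it links to.
links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
}

pages = sorted(links)                               # fixed node order
index = {page: i for i, page in enumerate(pages)}   # page -> row/column number
n = len(pages)

# Adjacency matrix: A[i][j] = 1 if page i links to page j, else 0.
A = [[0] * n for _ in range(n)]
for source, targets in links.items():
    for target in targets:
        A[index[source]][index[target]] = 1


def matmul(X, Y):
    """Plain-Python product of two square matrices of equal size."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


out_degree = [sum(row) for row in A]         # edge counting: row sums
in_degree = [sum(col) for col in zip(*A)]    # edge counting: column sums
two_step = matmul(A, A)                      # edge following, applied twice
```

Row sums give out-degrees, column sums give in-degrees, and entry (i, j) of the squared matrix counts the paths of length two from page i to page j.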

unit|What is a `model'?

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Understand that a model abstracts from reality
 * 2) Understand that a model focuses on few aspects of reality
 * 3) Understand that a model facilitates the analysis of reality
 * 4) Be prepared to encounter many more models in the following weeks

=lesson|Modelling Static Content on the Web and the Dynamics of Content Generation=
 * learningGoals=
 * 1) How to use bag of words to describe a document
 * 2) How to represent bag of words in linear algebra
 * 3) How to realize distance/similarity of documents by metrics on vector space
 * 4) Understand the keyword query as a document
 * 5) Understand that not every word has the same importance (what are stopwords? e.g. `the')

unit|Vector Spaces as a tool for modelling

 * learningGoals=
 * 1) choosing a base.
 * 2) dimension.
 * 3) metrics like Euclidean and cosine distance.
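
A small sketch may help to connect bag of words, the choice of a base, and the two metrics. It assumes that the base is the sorted vocabulary of both texts and that a keyword query is treated like a very short document; the example sentence and query are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count how often each lower-cased, whitespace-separated word occurs."""
    return Counter(text.lower().split())


def to_vector(bag, vocabulary):
    """Represent a bag of words as a vector over a fixed vocabulary (the base)."""
    return [bag[term] for term in vocabulary]


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0


# Hypothetical document and keyword query.
doc = bag_of_words("the web is a graph the web grows")
query = bag_of_words("web graph")

vocabulary = sorted(set(doc) | set(query))   # one dimension per distinct word
d = to_vector(doc, vocabulary)
q = to_vector(query, vocabulary)

distance = euclidean_distance(d, q)
similarity = cosine_similarity(d, q)
```

The dimension of the vector space equals the size of the vocabulary. Cosine similarity ignores document length, whereas Euclidean distance grows with it.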

unit|Matrix representation of Metrics

 * learningGoals=
 * 1) being able to interpret the matrix representation
 * 2) be able to represent a document of words in a vector space of words
 * 3) apply singular value decomposition

unit|Introduce TF scores
It might be more natural for the learner to only introduce TF here and leave IDF for later (collective intelligence I), together with page rank


 * learningGoals=
 * 1) understanding the problems with absolute values
 * 2) be able to express the ideas of TF-IDF
 * 3) be able to interpret a TF score and an IDF score independently
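
The two scores can be computed separately, as the following sketch shows. It assumes one common variant (TF as relative frequency within a document, IDF as the natural logarithm of the number of documents divided by the document frequency); other weightings exist, and the three-document corpus is hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus.
documents = {
    "d1": "the web is a graph",
    "d2": "the web is large",
    "d3": "the graph has nodes",
}
tokens = {name: text.split() for name, text in documents.items()}


def tf(term, name):
    """Relative term frequency: avoids the problem with absolute counts."""
    words = tokens[name]
    return Counter(words)[term] / len(words)


def idf(term):
    """Inverse document frequency: log(N / number of documents with the term)."""
    df = sum(1 for words in tokens.values() if term in words)
    return math.log(len(tokens) / df) if df else 0.0


def tf_idf(term, name):
    return tf(term, name) * idf(term)
```

In this corpus `idf("the")` is zero because the word occurs in every document, while `idf("nodes")` is the largest value because the word occurs in only one.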

unit|Urn Models

 * learningGoals=
 * 1) TBA

unit|Simon Model

 * learningGoals=
 * 1) TBA

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|Web Graph Formation: Modelling the Micro Dynamics of the Web Graph to Explain Macro Effects=
 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) know some basic models for generating graphs
 * 2) know about basic aggregated graph properties like degree distribution, graph diameter
 * 3) understand that Web graph structure is not arbitrary
 * 4) understand that not all graph generators create graphs with Web-like graph properties
 * 5) understand that micro-models with explanatory power create graphs with more Web-like graph properties

unit|Basic Graph properties

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Node Degree
 * 2) in and out degree distribution
 * 3) Power law distribution
 * 4) eigenvalues and eigenvectors ? (this could be moved to a more advanced section)
 * 5) measuring the distance between graphs

unit|Erdős–Rényi Model

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Barabási–Albert Model

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Preferential attachment

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|Processes on the Web Graph: The Example of Modelling the Dynamics of Meme Spreading=
 * video=File:Web science mooc flipped class room spreading memes.webm
 * learningGoals=
 * 1) understand that network structure determines processes, such as individual communication
 * 2) understand that the network structure determines global communication results
 * 3) understand how to model micro-behavior of individuals at large
 * 4) understand how to relate dying and exploding memes to the same model
 * 5) understand the difference of perspectives between micro interactions and macro effects
 * 6) Know http://www.nature.com/srep/2012/120329/srep00335/full/srep00335.html
 * 7) Know about effective distance http://link.springer.com/article/10.1140%2Fepjb%2Fe2011-20208-9 http://rocs.hu-berlin.de/D3/ebola/


 * furtherReading=
 * 1) understanding http://www.nature.com/srep/2012/120329/srep00335/full/srep00335.html

unit|Overview of the phenomenon

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Experimental Setup and Methodology of the Memes spreading Model

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Mathematical foundations of the Memes spreading Model

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Results of the Memes spreading Model

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|Reflections on Modelling the Web and its Dynamics=
 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Understand different types of models and what to use them for: descriptive, predictive, normative
 * 2) Be able to reflect on the process of modelling

unit|Theory of Social Capital

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) understand that the theory of social capital is just one theoretical framework and there exist others
 * 2) learn about reputation, weak reciprocity, strong reciprocity (cf. talk of Andreas Diekmann at WinterCSS)
 * 3) Be able to name and identify the three dimensions of social capital
 * 4) others?

unit|Randomness vs Regularity

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Distinguishing descriptive vs predictive models

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|The process of modelling

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * File:ComparingModels.png
 * b
 * c

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|More Micro Behavior and Macro Effect I: Collective Intelligence=


 * learningGoals=
 * 1) Know about examples of collective intelligence in the Web (and beyond)
 * 2) Understand that clever aggregation of randomly noisy sensor output leads to high quality measurements
 * 3) Understand that independence of judgement is key to high quality collective decision making
 * 4) Relate this to law of large numbers
 * 5) Understand the idea of a social sensor: Model people output as sensor output
 * 6) Understand the idea of recursive aggregation of reputation
 * 7) Understand limitations of when collective intelligence cannot be derived

unit|IDF as Simple Form of Collective Intelligence

 * learningGoals=
 * 1) IDF aggregates common usage of vocabulary
 * 2) knowledge about common usage of vocabulary models term specificity

unit|In-degree as Form of Collective Intelligence

 * learningGoals=
 * 1) IDF aggregates common usage of vocabulary
 * 2) knowledge about common usage of vocabulary models term specificity

unit|Random surfer Model

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Page rank of Graph/Matrix

 * learningGoals=
 * 1) Eigenvalues are an important metric to describe graphs.
 * 2) Decomposing large matrices is computationally heavy.
 * 3) relation to the random surfer model
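
The relation between the random surfer and the matrix view can be illustrated with power iteration, which avoids decomposing the matrix explicitly. The sketch below makes several assumptions: a damping factor of 0.85, a uniform jump for pages without outgoing links, a fixed number of iterations, and a hypothetical three-page graph in which every link target is also a key of `links`.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by repeatedly applying the random surfer step."""
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}        # start uniformly
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # Otherwise the surfer follows one outgoing link at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page without links sends the surfer anywhere.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy Web graph.
links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
}
ranks = pagerank(links)
```

The resulting vector is the stationary distribution of the random surfer, that is, an eigenvector with eigenvalue 1 of the damped transition matrix.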

=lesson|More Micro Behavior and Macro Effect II: Herding=
 * furtherReading=
 * 1) understand: https://www.princeton.edu/~mjs3/salganik_dodds_watts06_full.pdf
 * 2) know some basics about: Herd behavior from the field of psychology
 * Absolute_difference
 * Randomized_experiment
 * Randomized_controlled_trial
 * Randomization
 * Web-based_experiments
 * Conditional_independence
 * Independence_(probability_theory)
 * Dependent_and_independent_variables
 * Herd_behavior
 * Systematic_error


 * learningGoals=
 * 1) Know and understand the notion of herding and swarms
 * 2) Know and understand that local information and positive feedback cycles may destroy collective intelligence (e.g. Groupthink, shitstorms, Klaas' tagging experiments, stock exchange.....)
 * 3) Know about examples of herding, such as preferential attachment, music experiment,...
 * 4) Understand how herding can be measured in an experiment
 * 5) How to conduct a web based experiment with a control group?
 * 6) Get to know one specific experiment and methodology that demonstrated herd behavior on the web.
 * 7) Understand how to empirically design an experiment that can demonstrate herd behavior.
 * 8) Discussing systematic errors in experiments
 * 9) Understand that it is nontrivial to verify herding phenomena.
 * 10) understand: https://www.princeton.edu/~mjs3/salganik_dodds_watts06_full.pdf
 * video=File:Web science mooc recommendations.webm

unit|Research question of herd behavior, inequality and unpredictability of cultural markets

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) What are the research questions that will be answered in the experiment
 * 2) understand that a good study starts with a research question
 * 3) The concept of falsifiability.
 * 4) Good research questions often start with an observation (e.g.: experts have frequently failed to predict the success of musicians)
 * furtherReading=
 * 1) Falsifiability
 * 2) Design_of_experiments
 * 3) Research_question
 * 4) Experiment

unit|Experimental Setup and data collection process

 * learningGoals=
 * 1) difference between the dependent and independent group
 * 2) what is scientific control
 * 3) Repetition of the experiment (why do the authors have 8 worlds?) in order to conduct a randomized experiment.
 * furtherReading=
 * 1) Dependent_and_independent_variables
 * 2) Independence_(probability_theory)
 * 3) Treatment_and_control_groups
 * 4) Randomized_experiment
 * 5) Randomized_controlled_trial

unit|Discussion of Systematic errors

 * learningGoals=
 * 1) Critical discussion of the web limitations that are posed in the paper. (web scientists can get rid of some of these mistakes)
 * 2) Understand that systematic errors are part of many experiments.
 * 3) Learn to discuss systematic errors of a paper.
 * 4) which measures have been taken to minimize the amount of systematic errors (e.g. introducing 8 worlds)
 * furtherReading=
 * 1) Systematic error
 * 2) Web based experiments

unit|Metrics and their mapping to the research questions

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) a measure for inequality: the Gini coefficient
 * 2) unpredictability needs the 8 worlds to see how different rankings are
 * 3) market share
 * furtherReading=
 * Gini_coefficient
 * Mean_difference for unpredictability
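
The three metrics can be sketched as follows. This is an illustration under assumptions, not the exact procedure of the paper: the Gini coefficient is computed from the mean absolute difference, unpredictability is read as the mean absolute difference of one song's market share across worlds, and the download counts for two worlds are hypothetical.

```python
from itertools import combinations


def market_shares(downloads):
    """Share of each song in the total number of downloads of one world."""
    total = sum(downloads)
    return [d / total for d in downloads] if total else [0.0] * len(downloads)


def gini(values):
    """Gini coefficient: 0 means perfect equality, larger means more unequal."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * total)


def mean_absolute_difference(values):
    """Average |a - b| over all unordered pairs, e.g. one song across worlds."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs) if pairs else 0.0


# Hypothetical download counts of three songs in two worlds.
world_1 = market_shares([50, 30, 20])
world_2 = market_shares([10, 30, 60])

inequality_world_1 = gini(world_1)
unpredictability_song_0 = mean_absolute_difference([world_1[0], world_2[0]])
```

Several worlds are needed for the last metric because the difference in a song's market share can only be observed when the same song is offered to independent groups.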

unit|Results of the Music Recommendation herding experiments

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) we can observe clear herding behavior.
 * 2) the way content is presented on the web has an impact on how people consume it.
 * c

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) music experiments are just one empirical indicator for herding behavior
 * 2) other behavior might need another scientific methodology to identify the behavior.
 * 3) Dellschaft shows that herding may reduce quality of information categorization
 * 4) More on herding: link to t.co/KIgxYSdvCw

=lesson|Harnessing people behavior I: From recommendation technologies to personalization, filter bubbles and price discrimination=
 * learningGoals=
 * 1) understand technology of ranking and recommendation algorithms
 * 2) understand the concept of relevance
 * 3) understand antagonism between personalization and filter bubble
 * 4) understand antagonism between personalization and price discrimination
 * 5) know about different forms of price discrimination (higher prices for same product vs. offers of more higher-priced products)
 * 6) understand the aspect of price discrimination

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|Harnessing people behavior II: Crowdsourcing=
 * learningGoals=
 * 1) understand terminology: crowdsourcing, human computation, collective intelligence
 * 2) know models: crowdfunding, wikipedia, ...
 * 3) know about incentivization
 * 4) know about astroturfing, crowdturfing and distinguish from social bots

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|Advertisement Ecosystems=
 * learningGoals=
 * 1) Understand how cross-site advertisement providers function on the Web
 * 2) Understand advertisement KPIs
 * 3) Relate to recommendations

unit|Introduction to Online Advertisement

 * furtherReading=
 * 1) Online_advertising
 * 2) http://www.rene-pickhardt.de/retargeting-smart-online-marketing-system-by-criteo/
 * 3) http://www.iab.net/media/file/IAB_Internet_Advertising_Revenue_Report_FY_2013.pdf and http://www.iab.net/research/industry_data_and_landscape/adrevenuereport
 * learningGoals=
 * 1) understand the interests of the 4 players (publisher (content owner), advertiser (some brand), ad-service, consumer)
 * 2) be aware of the online ad market and be able to relate it to other ad markets
 * 3) be aware of advertising formats
 * 4) be aware of payment formats for online advertisement
 * video=File:Introduction_to_Online_Advertisement.webm

unit|Metrics for (online) advertisement

 * furtherReading=
 * 1) http://tlvmedia.com/pdf/CPM_CPC_CPA_dCPM.pdf
 * 2) Cost_per_mille
 * 3) Click-through_rate
 * 4) Pay_per_click
 * 5) Affiliate_marketing and Cost_per_acquisition
 * 6) Bounce_rate
 * 7) Conversion_rate
 * learningGoals=
 * 1) be able to list basic metrics of online advertisement (CPC, CTR, CR, BR, CPM) and calculate them
 * 2) be able to interpret the metrics.
 * 3) understand which player should optimize which metric
 * video=File:Under_construction_icon-blue.svg
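
The calculation of the basic metrics can be sketched as below. The campaign numbers are hypothetical, and two simplifications are assumed: every click leads to exactly one visit, so that conversion rate and bounce rate are both taken per click, and impressions and clicks are nonzero.

```python
def ad_metrics(impressions, clicks, conversions, bounces, cost):
    """Basic online advertisement metrics for one campaign.

    Assumes impressions > 0 and clicks > 0, and one visit per click.
    """
    return {
        "CTR": clicks / impressions,          # click-through rate
        "CPC": cost / clicks,                 # cost per click
        "CPM": cost / impressions * 1000,     # cost per thousand impressions
        "CR": conversions / clicks,           # conversion rate
        "BR": bounces / clicks,               # bounce rate
    }


# Hypothetical campaign: 10,000 impressions bought for a total cost of 100.
metrics = ad_metrics(impressions=10_000, clicks=200, conversions=10,
                     bounces=120, cost=100.0)
```

For this campaign the click-through rate is 2 percent, each click costs 0.50, and a thousand impressions cost 10.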

unit|Factors that have impact on advertisement campaigns

 * furtherReading=
 * 1) Conversion_optimization
 * 2) Landing_page_optimization
 * 3) Bait-and-switch
 * 4) Frequency_capping
 * 5) Lead_scoring
 * 6) Targeted_advertising
 * 7) Negative_keyword (very interesting, it shows the amount of data Google has due to ad products)
 * 8) Online_advertising
 * 9) Behavioral_targeting
 * 10) Contextual_advertising
 * learningGoals=
 * 1) Relevance
 * 2) Targeting (which is a form of relevance)
 * 3) User Context
 * 4) Truthfulness of the ad
 * 5) design of the landing page (usability)
 * video=File:Under_construction_icon-blue.svg

unit|Finding the true value of an advertisement

 * furtherReading=
 * 1) Second_price_auction
 * 2) Auction_theory
 * 3) Game_theory
 * 4) Nash_equilibrium
 * 5) Generalized_second-price_auction
 * 6) original literature: paper and slides
 * learningGoals=
 * 1) Second price auctions
 * 2) Collective intelligence
 * video=File:Under_construction_icon-blue.svg
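
A sealed-bid second-price auction for a single item can be sketched as follows. The bidders, their bids and the true value are hypothetical, and the sketch assumes at least two bidders and ignores reserve prices.

```python
def second_price_auction(bids):
    """Return (winner, price): the highest bidder wins, pays the second bid."""
    if len(bids) < 2:
        raise ValueError("this sketch assumes at least two bidders")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


def utility(bidder, true_value, bids):
    """True value minus price if the bidder wins, otherwise zero."""
    winner, price = second_price_auction(bids)
    return true_value - price if winner == bidder else 0.0


# Hypothetical setting: the item is worth 10 to alice; bob and carol bid 7 and 4.
others = {"bob": 7.0, "carol": 4.0}
truthful = utility("alice", 10.0, {**others, "alice": 10.0})
shaded = utility("alice", 10.0, {**others, "alice": 6.0})
inflated = utility("alice", 10.0, {**others, "alice": 20.0})
```

In this example bidding below the true value loses the item, and bidding above it does not lower the price, which hints at why bidding the true value is a dominant strategy in this auction format.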

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|Web Search Architecture=
 * learningGoals=
 * 1) Get a technical feeling of the components of a search engine

unit|Components of a Search Engine

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Be able to state the components of a search engine together with their purpose
 * 2) Understand that a search engine is not just a simple box on a simple web site

unit|Web Crawler

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) understand the basic concepts of a web crawler
 * 2) be able to define what nice crawling means (respecting robots.txt, time between requests to the same domain)
 * 3) be able to understand problems with crawling (duplicate content, broken HTML, JavaScript, big data processing, when to update a page)

unit|Search Index

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) be able to name the two important data structures to create an inverted index (linked list, tree).
 * 2) be able to draw schematic pictures of the search index.
 * 3) be aware that it will be necessary and possible to compress and distribute the index.
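
A minimal inverted index might look like the sketch below. It is a simplification under assumptions: a Python dictionary stands in for the tree that maps terms to their postings, a sorted Python list stands in for the linked list of document identifiers, and the three documents are hypothetical. Compression and distribution are not covered.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each term to the sorted list of ids of documents containing it."""
    index = defaultdict(list)
    for doc_id in sorted(documents):
        for term in set(documents[doc_id].lower().split()):
            index[term].append(doc_id)      # ids arrive in increasing order
    return dict(index)


def intersect(postings_a, postings_b):
    """Merge two sorted posting lists, keeping ids present in both."""
    i = j = 0
    result = []
    while i < len(postings_a) and j < len(postings_b):
        if postings_a[i] == postings_b[j]:
            result.append(postings_a[i])
            i += 1
            j += 1
        elif postings_a[i] < postings_b[j]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical toy collection.
documents = {
    1: "the web is a graph",
    2: "the web is large",
    3: "the graph has nodes",
}
index = build_inverted_index(documents)
hits = intersect(index["web"], index["graph"])
```

Because the posting lists are sorted, a query with two keywords can be answered with one linear pass over both lists.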

unit|Ranking System

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Be aware that ranking factors have to be chosen by humans at programming time
 * 2) Be able to name at least two ranking factors used in search engines (PageRank and TF-IDF)
 * 3) Know at least one possible method to combine various ranking schemes. (linear combination)
 * 4) understand that the inverted index is ordered according to the ranking scheme
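
A linear combination of ranking factors can be sketched in a few lines. The weights and the per-page scores are hypothetical values chosen by hand, which mirrors the point that humans fix the ranking factors at programming time.

```python
def combined_score(features, weights):
    """Weighted sum of ranking factors for one page."""
    return sum(weights[name] * features[name] for name in weights)


# Hypothetical weights and per-page scores for one query.
weights = {"pagerank": 0.3, "tf_idf": 0.7}
candidates = {
    "a.html": {"pagerank": 0.5, "tf_idf": 0.1},
    "b.html": {"pagerank": 0.2, "tf_idf": 0.6},
}

ranking = sorted(candidates,
                 key=lambda page: combined_score(candidates[page], weights),
                 reverse=True)
```

Changing the weights changes the order of the results, so the choice of weights is itself a modelling decision.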

unit|User Interface of a Search Engine

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Search box
 * 2) auto completion
 * 3) did you mean (spelling correction)

unit|Query Processing

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) how is a search query evaluated?
 * c
 * c

unit|Discussing the Quality of web search engines

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) discuss concepts like false positives, true negatives,...
 * 2) understand that the search engine has pretty little idea about semantics
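
The four outcome classes, and the two quality measures usually derived from them, can be sketched as follows. The collection of ten documents, the result list and the relevance judgements are hypothetical.

```python
def confusion_counts(retrieved, relevant, collection):
    """Count true/false positives and negatives for one query."""
    retrieved, relevant, collection = set(retrieved), set(relevant), set(collection)
    tp = len(retrieved & relevant)               # returned and relevant
    fp = len(retrieved - relevant)               # returned but not relevant
    fn = len(relevant - retrieved)               # relevant but not returned
    tn = len(collection - retrieved - relevant)  # neither returned nor relevant
    return tp, fp, fn, tn


# Hypothetical query result over a collection of ten documents.
collection = range(1, 11)
retrieved = [1, 2, 3, 4]
relevant = [2, 4, 5]

tp, fp, fn, tn = confusion_counts(retrieved, relevant, collection)
precision = tp / (tp + fp)   # fraction of returned documents that are relevant
recall = tp / (tp + fn)      # fraction of relevant documents that were returned
```

On the Web the number of true negatives is enormous for almost every query, which is one reason why precision and recall are more informative than plain accuracy.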

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) be able to draw the architecture of a simple search engine
 * 2) understand that this simple architecture covers neither distributed systems nor asynchronous parallel programming techniques
 * 3) understand that search is an inherently difficult and unsolved problem from a technical as well as semantic point of view
 * 4) understand that web search is crucial for the ecosystem of the web.

=lesson|Web Search Ecosystem=
 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) be able to name the players and relevant metrics for the online advertisement market.
 * 2) be able to explain why online advertisement is an excellent business model for search engines.
 * 3) understand the threats to companies that have online advertisement as a business model.

unit|The Business Model of Search Engines

 * furtherReading=
 * 1) http://bmimatters.com/2012/03/29/understanding-google-business-model/
 * 2) Overture_Services
 * 3) Google_AdWords
 * 4) http://news.cnet.com/Google,+Yahoo!+bury+the+legal+hatchet/2100-1024_3-5302421.html
 * 5) Performance-based_advertising
 * 6) Web_search
 * learningGoals=
 * 1) understand that search engines are a man in the middle (for search as well as for advertisement)
 * 2) search engines have to work on relevance (in search and for advertisement)
 * 3) Advertisement has the highest click-through rates in search since people have an actual information need and are willing to click.
 * video=File:Under_construction_icon-blue.svg

unit|Understanding the Problems with Click Fraud

 * furtherReading=
 * 1) Click_fraud
 * 2) Click_farm
 * learningGoals=
 * 1) Understand reasons why people would produce click fraud
 * 2) Understand to whom click fraud is harmful.
 * video=File:Under_construction_icon-blue.svg

unit|Understanding the problems with Web SPAM

 * furtherReading=
 * 1) Link_farm
 * 2) Content_farm
 * 3) Spamdexing
 * 4) Spam_in_blogs
 * 5) Adversarial_information_retrieval gives a good overview and is also a research topic
 * 6) Spamming gives a more general overview not only related to search engines
 * learningGoals=
 * 1) Manipulation of PageRank
 * 2) not relevant to the user
 * 3) as landing pages with ads this is part of click fraud
 * video=File:Under_construction_icon-blue.svg

unit|Summary, Further readings, Homework

 * furtherReading=
 * 1) Your task is to design an experiment with which you could decide which of two landing pages, A or B, is better suited to make a music fan download a piece of music.
 * 2) Take the following data of users and the true value they assume an object has. Write a program that implements a simple first-price and second-price auction model. Now simulate user behavior in which users do not bid according to their assumed "true value". Verify empirically that strategic bidding can maximize the outcome in first-price auctions, but that bidding the true value is the dominant strategy in a second-price auction.
 * learningGoals=
 * a
 * b
 * c
 * video=File:Under_construction_icon-blue.svg

=lesson|Online Communities=
 * learningGoals=

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|Business Models for the Internet (Maybe)=
 * learningGoals=

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|When Politicians Talk=
 * learningGoals=
 * 1) understanding: http://markusstrohmaier.info/documents/2014_icwsm2014_politicians.pdf
 * 2) understand that it is possible to find computational metrics to quantify social concepts
 * 3) understand that there is a choice in computational metrics and that the choice has to be well motivated.

unit|Research Questions and introduction to the Problem

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Understand the problem setting
 * 2) Be able to state the research questions

unit|Experimental Setup and data sets

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Understand Twitter features such as follow, retweet, hashtag and @mention
 * 2) understand how the data set is composed
 * 3) understand the two big offline events that took place (TV debate and election)

unit|Cultural Focus and Similarity

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) understand the social concepts of cultural focus and cultural similarity
 * 2) Be able to explain why Shannon Entropy is a good measure for cultural focus
 * 3) Be able to explain why Cosine similarity is a good measure for cultural similarity
 * 4) Be able to interpret the results for the two metrics from the diagrams.
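
The intuition behind entropy as a measure of focus can be sketched with hashtag counts. The hashtag lists are hypothetical, the entropy is measured in bits and left unnormalized, and reading low entropy as strong focus is an interpretation for illustration rather than the exact definition used in the paper.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Entropy in bits of the empirical distribution of the given items."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical hashtag usage of two groups of accounts.
focused = ["#election"] * 8
spread = ["#election", "#economy", "#energy", "#europe"] * 2

entropy_focused = shannon_entropy(focused)
entropy_spread = shannon_entropy(spread)
```

A group that uses a single hashtag has entropy zero, while a group that spreads its attention evenly over four hashtags reaches two bits.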

unit|Styles Institutions and Reproduction

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) understand the social concepts of Institutions and Reproduction
 * 2) Get to know Rank Biased Overlap as a metric and be able to explain why it is a good fit to measure cultural reproduction
 * 3) Get to know the Hirsch Index as a metric and be able to explain why it is a good fit for Institutionness
 * 4) Be able to interpret the results for the two metrics from the diagrams.

unit|Punctuations (Burstiness)

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Understand the social concept of Punctuation
 * 2) Get to know Kleinberg's Burst Weight as a metric and be able to explain why it is a good fit to measure Punctuation
 * 3) Be able to interpret the results for the metrics from the diagrams.

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

=lesson|Modelling the World Wide Web=
 * video=File:Under_construction_icon-blue.svg
 * learningGoals=

unit|What Web Properties do we want to model?

 * learningGoals=
 * 1) Modelling Web Content
 * 2) Modelling Web Pages with hyperlinks
 * 3) Modelling User interests
 * 4) Linear Algebra can be one (of many!) toolkits for modelling.
 * 5) Even while using linear algebra to model the same thing there are several options for modelling coming to different conclusions

unit|Modelling with Matrices

 * learningGoals=
 * 1) choosing a base to represent something as a matrix.
 * 2) Adjacency Matrix of a graph (via using nodes as a basis for a vector space)
 * 3) TF-IDF Matrix of a collection of documents (using terms and documents as two bases for vector spaces and the matrix as a linear map)
 * 4) User-Interest Matrix of a Webshop (using users and products as two bases for vector spaces and the matrix as a linear map)
 * 5) Find other matrices for the following settings: Social Network, Mobile Apps, User Trails

unit|Interpreting Basic matrix operations

 * learningGoals=
 * 1) Multiplying a matrix from the right or left with the all-ones vector to do counting
 * 2) Node in and out degree, how often does a term occur, how many terms are in a document, how many people like a product, how many products a person likes
 * 3) Multiplying with a base vector for extracting a column or a row
 * 4) Transposing a matrix and its interpretation for our examples.
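
These interpretations can be tried out on a tiny term-document matrix. The matrix entries are hypothetical counts, and the helpers `mat_vec`, `vec_mat` and `transpose` are written by hand for illustration only.

```python
terms = ["graph", "web"]
docs = ["d1", "d2", "d3"]

# Hypothetical counts: M[i][j] = occurrences of term i in document j.
M = [
    [1, 0, 2],
    [3, 1, 0],
]


def mat_vec(matrix, vector):
    """Multiply a matrix with a column vector from the right."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def vec_mat(vector, matrix):
    """Multiply a matrix with a row vector from the left."""
    return [sum(vector[i] * matrix[i][j] for i in range(len(matrix)))
            for j in range(len(matrix[0]))]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


term_totals = mat_vec(M, [1, 1, 1])    # how often does each term occur?
doc_lengths = vec_mat([1, 1], M)       # how many terms are in each document?
column_d2 = mat_vec(M, [0, 1, 0])      # base vector extracts the column of d2
```

Transposing the matrix swaps the roles of terms and documents, so multiplying the transposed matrix from the right with the all-ones vector yields the document lengths again.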

unit|Interpreting Metrics and Scalar products

 * learningGoals=
 * 1) A Metric can be used to indicate similarity
 * 2) Some matrices can be interpreted as metrics and induce a similarity measure.
 * 3) Scalar products from matrices.
 * 4) Depending on the model of the web we get a different notion of similarity (using cosine similarity for the adjacency matrix of web pages or using cosine similarity of their documents)

unit|Interpreting fundamental properties of Matrices

 * learningGoals=
 * 1) base change
 * 2) diagonalization (Singular value decomposition) (special kind of base that is chosen)
 * 3) interpretation of an eigenvector for the adjacency matrix
 * 4) interpretation of an eigenvector for the TF-IDF Matrix
 * 5) Using rank reduction as a link predictor (again another way to express similarity)

=lesson|Modelling the Dynamics of the Web Graph: From Micro Interactions to Macro Effects=
 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) know some basic models for generating graphs
 * 2) know about basic aggregated graph properties like degree distribution, graph diameter
 * 3) understand that Web graph structure is not arbitrary
 * 4) understand that not all graph generators create graphs with Web-like graph properties
 * 5) understand that micro-models with explanatory power create graphs with more Web-like graph properties

unit|Basic Graph properties

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * 1) Node Degree
 * 2) in and out degree distribution
 * 3) Power law distribution
 * 4) eigenvalues and eigenvectors ? (this could be moved to a more advanced section)
 * 5) measuring the distance between graphs

unit|Erdős–Rényi Model

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Barabási–Albert Model

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Preferential attachment

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c

unit|Summary, Further readings, Homework

 * video=File:Under_construction_icon-blue.svg
 * learningGoals=
 * a
 * b
 * c