Exploiting large-scale data is a central challenge of our time. Machine Learning is the core discipline for addressing this challenge, aiming to extract useful models and structure from data. Studying Machine Learning is motivated in multiple ways: 1) as the basis of commercial data mining (Google, Amazon, Picasa, etc.), 2) as a core methodological tool for data analysis in all sciences (vision, linguistics, software engineering, but also biology, physics, neuroscience, etc.), and finally 3) as a core foundation of autonomous intelligent systems. In this seminar, students will present seminal papers from the area of Machine Learning. Background in Machine Learning, e.g. from the Machine Learning course, is necessary.
This advanced seminar will be held completely in English. INFOTEC, cybernetics and other master students are welcome.
Participants have to give a presentation and write a summary paper.
Presentation
 20 min presentation of the paper
 10 min Q&A
 The other students should be able to grasp the paper afterwards!
 The other students will give you feedback.

DATES: 15th, 22nd and 29th of January 2014 (check table below).
Summary paper
 Do not plagiarize! Writing a summary paper means that you describe, in your own words, the paper’s motivation, contributions, limitations and relations to other work. When referring to the authors’ work, say “the authors propose…” or “they developed…”.
 Summary papers must be written in the style of ICML (Int. Conf. on Machine Learning) using their style files (preferably LaTeX). Find these style files online.
 The bibliography should follow scientific standards, preferably using BibTeX as described in the ICML style.
 A total of ~3500 words with the following content:
 Motivation and problem: What was the authors’ motivation for this research? What is the problem they are trying to solve?
 State of the art and contributions: What was the state of the art BEFORE this paper, and what do the authors aim and claim to contribute to the state of the art with this work?
 Summarize the methods, techniques, theory, algorithms, etc. that they develop.
 Summarize their evaluation results.
 Research and discuss the impact that this paper had on later research (e.g. use Google Scholar to find citations of this paper).
 Add a personal assessment of the paper including critique and suggestions for improvements.
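Since the exact ICML style files change from year to year, here is only a minimal sketch of what a summary paper skeleton might look like. The file names `icml2014.sty` and `summary.bib`, and the example sentence, are assumptions for illustration; use the actual names from the style package you download:

```latex
\documentclass{article}
% Assumed file name; use the .sty file shipped with the ICML style package you downloaded.
\usepackage{icml2014}

\begin{document}

% Cite the summarized paper via a key from your .bib file, e.g. the CRF entry listed under "Papers".
Lafferty et al.\ propose conditional random fields \cite{lafferty:01}.

% Assumed bibliography file name; summary.bib would contain BibTeX entries like those below.
\bibliography{summary}
\bibliographystyle{icml2014}

\end{document}
```

Compile with the usual latex/bibtex/latex/latex cycle so the citations resolve.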

DEADLINE: 26th of February 2014.
Date  Speaker  Selected Paper 

2014.01.15  Vincke J.  “A View of the EM algorithm that justifies incremental, sparse and other variants” 
2014.01.15  Scheuefele K.  “Active Learning for Parameter Estimation in Bayesian Networks” 
2014.01.15  Mehlbeer F.  “Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data” 
2014.01.15  Fuchs S.  “Active Learning with Statistical Models” 
2014.01.22  Hirschmann S.  “Discovering Hidden Variables: A Structure-Based Approach” 
2014.01.22  Fleischer L.  “Identifying Hierarchical Structure in Sequences: A Linear-Time Algorithm” 
2014.01.22  Hamann M.  “Graphical Models: Structure Learning” 
2014.01.29  Fontanarosa R.  “The Infinite Hidden Markov Model” 
2014.01.29  Rupp T.  “Knows What It Knows: A Framework for Self-Aware Learning” 
2014.01.29  Ziegenhagel A.  “Support Vector Machine Learning for Interdependent and Structured Output Spaces” 
Papers
 J. Weston, F. Ratle, and R. Collobert: Deep learning via semi-supervised embedding. In Proc. of the 25th Int. Conf. on Machine Learning (ICML 2008), 2008. [Bibtex]
@InProceedings{ weston:08, author = "J. Weston and F. Ratle and R. Collobert", title = "Deep Learning via Semi-Supervised Embedding", booktitle = "Proc.\ of the 25th Int.\ Conf.\ on Machine Learning (ICML 2008)", year = "2008", pdf={http://publications.idiap.ch/downloads/papers/2012/Weston_SPRINGER_2012.pdf} }
 I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun: Support vector machine learning for interdependent and structured output spaces. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 104, ACM, 2004. [Bibtex]
@inproceedings{Tsochantaridis:2004:SVM:1015330.1015341, author = {Tsochantaridis, Ioannis and Hofmann, Thomas and Joachims, Thorsten and Altun, Yasemin}, title = {Support vector machine learning for interdependent and structured output spaces}, booktitle = {Proceedings of the twenty-first international conference on Machine learning}, series = {ICML '04}, year = {2004}, isbn = {1581138385}, location = {Banff, Alberta, Canada}, pages = {104}, url = {http://doi.acm.org/10.1145/1015330.1015341}, doi = {10.1145/1015330.1015341}, acmid = {1015341}, publisher = {ACM}, address = {New York, NY, USA}, pdf = {http://dl.acm.org/ft_gateway.cfm?id=1015341&ftid=273123&dwn=1&CFID=370986592&CFTOKEN=39715685} }
 I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun: (long version) Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6, 1453–1484, MIT Press, 2005. [Bibtex]
@Article{ tsochantaridis:05, author = "I. Tsochantaridis and T. Joachims and T. Hofmann and Y. Altun", title = "(LONG VERSION) Large margin methods for structured and interdependent output variables", journal = "Journal of Machine Learning Research", volume = "6", pages = "1453--1484", year = "2005", publisher = "MIT Press", pdf={http://machinelearning.wustl.edu/mlpapers/paper_files/TsochantaridisJHA05.pdf} }
 S. Tong and D. Koller: Active learning for parameter estimation in Bayesian networks. In Advances in Neural Information Processing Systems (NIPS 2000), 2001. [Bibtex]
@InProceedings{ tongkoller:01, title = "Active learning for parameter estimation in Bayesian networks", author = "S. Tong and D. Koller", booktitle = "Advances in Neural Information Processing Systems (NIPS 2000)", optaddress = "Denver, Colorado", year = "2001", pdf={http://ai.stanford.edu/~koller/Papers/Tong+Koller:NIPS00.pdf} }
 B. Taskar, C. Guestrin, and D. Koller: Max-margin Markov networks. In Advances in Neural Information Processing Systems (NIPS 2003), 16, MIT Press, 2004. [Bibtex]
@InCollection{ taskar:04, author = "Ben Taskar and Carlos Guestrin and Daphne Koller", title = "Max-Margin Markov Networks", booktitle = "Advances in Neural Information Processing Systems (NIPS 2003)", volume = "16", publisher = "MIT Press", address = "Cambridge, MA", year = "2004", pdf={http://books.nips.cc/papers/files/nips16/NIPS2003_AA04.pdf} }
 C. G. Nevill-Manning and I. H. Witten: Identifying hierarchical structure in sequences: a linear-time algorithm. Journal of Artificial Intelligence Research, 7, 67–82, 1997. [Bibtex]
@Article{ nevillmanningwitten:97, author = "Craig G. Nevill-Manning and Ian H. Witten", title = "Identifying hierarchical structure in sequences: A linear-time algorithm", journal = "Journal of Artificial Intelligence Research", volume = "7", pages = "67--82", year = "1997", pdf={http://arxiv.org/pdf/cs/9709102.pdf} }
 R. M. Neal and G. E. Hinton: A view of the EM algorithm that justifies incremental, sparse, and other variants. Learning in Graphical Models, 89, 355–368, 1998. [Bibtex]
@Article{ nealhinton:98, title = {A view of the EM algorithm that justifies incremental, sparse, and other variants}, author = {Neal, R.M. and Hinton, G.E.}, journal = {Learning in graphical models}, volume = {89}, pages = {355--368}, year = {1998}, pdf={ftp://ftp.cdf.toronto.edu/dist/radford/emk.pdf} }
 T. P. Minka: Expectation propagation for approximate Bayesian inference. In Proc. of the 17th Annual Conf. on Uncertainty in AI (UAI 2001), 362–369, 2001. [Bibtex]
@InProceedings{ minka:01uai, author = "T. P. Minka", title = "Expectation propagation for approximate {B}ayesian inference", booktitle = "Proc. of the 17th Annual Conf.\ on Uncertainty in AI (UAI 2001)", pages = "362--369", year = "2001", pdf={http://arxiv.org/pdf/1301.2294v1.pdf} }
 L. Li, M. L. Littman, T. J. Walsh, and A. L. Strehl: Knows what it knows: a framework for self-aware learning. Machine Learning, 82(3), 399–443, Springer, 2011. [Bibtex]
@Article{ li2011knows, title = {Knows what it knows: a framework for self-aware learning}, author = {Li, Lihong and Littman, Michael L and Walsh, Thomas J and Strehl, Alexander L}, journal = {Machine learning}, volume = {82}, number = {3}, pages = {399--443}, year = {2011}, publisher = {Springer}, pdf={http://www.research.rutgers.edu/~lihong/pub/Li08Knows.pdf} }
 J. Lafferty, A. McCallum, and F. Pereira: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In Int. Conf. on Machine Learning (ICML 2001), 282–289, 2001. [Bibtex]
@InProceedings{ lafferty:01, title = "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data", author = "J. Lafferty and A. McCallum and F. Pereira", booktitle = "Int.\ Conf.\ on Machine Learning (ICML 2001)", pages = "282--289", year = "2001", pdf={http://www.cis.upenn.edu/~pereira/papers/crf.pdf} }
 Kschischang, Frey, and Loeliger: Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47, 2001. [Bibtex]
@Article{ kschischangfreyloelinger:01, author = "Kschischang and Frey and Loeliger", title = "Factor graphs and the sum-product algorithm", journal = "IEEE Transactions on Information Theory", volume = "47", year = "2001", pdf= {http://www.psi.toronto.edu/pubs/2001/frey2001factor.pdf} }
 Heckerman: Graphical models: structure learning. In The Handbook of Brain Theory and Neural Networks (2nd edition), MIT Press, 2002. [Bibtex]
@InCollection{ heckerman:02, author = "Heckerman", year = "2002", title = "Graphical Models: Structure Learning", booktitle = "The Handbook of Brain Theory and Neural Networks (2nd edition)", publisher = "MIT Press", pdf = {http://mlg.eng.cam.ac.uk/zoubin/course04/hbtnn2eIII.pdf} }
 G. Elidan, N. Lotner, N. Friedman, and D. Koller: Discovering hidden variables: a structure-based approach. In NIPS, 479–485, 2000. [Bibtex]
@InProceedings{ elidanetal:00, author = "Gal Elidan and Noam Lotner and Nir Friedman and Daphne Koller", title = "Discovering Hidden Variables: A Structure-Based Approach", booktitle = "{NIPS}", pages = "479--485", year = "2000", pdf = {http://www.cs.huji.ac.il/~nir/Papers/ELFK1.pdf} }
 A. P. Dempster, N. M. Laird, and D. B. Rubin: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 1–38, JSTOR, 1977. [Bibtex]
@Article{ dempster1977maximum, title = {Maximum likelihood from incomplete data via the EM algorithm}, author = {Dempster, Arthur P and Laird, Nan M and Rubin, Donald B}, journal = {Journal of the Royal Statistical Society. Series B (Methodological)}, pages = {1--38}, year = {1977}, publisher = {JSTOR}, pdf = {http://people.cs.missouri.edu/~chengji/mlbioinfo/dempster_em.pdf} }
 D. A. Cohn, Z. Ghahramani, and M. I. Jordan: (long version) Active learning with statistical models. In Advances in Neural Information Processing Systems, 7, 705–712, The MIT Press, 1995. [Bibtex]
@InProceedings{ cohnghahramanijordan:95, author = "David A. Cohn and Zoubin Ghahramani and Michael I. Jordan", title = "(LONG VERSION) Active Learning with Statistical Models", booktitle = "Advances in Neural Information Processing Systems", volume = "7", publisher = "The {MIT} Press", editor = "G. Tesauro and D. Touretzky and T. Leen", pages = "705--712", year = "1995", pdf = {http://www.jair.org/media/295/live2951554jair.pdf} }
 M. I. Jordan, D. A. Cohn, and Z. Ghahramani: Active learning with statistical models. MIT Artificial Intelligence Laboratory, 1995. [Bibtex]
@misc{jordan1995active, title={Active Learning with Statistical Models}, author={Jordan, Michael I and Cohn, David A and Ghahramani, Zoubin}, year={1995}, publisher={MIT Artificial Intelligence Laboratory}, pdf={http://www.textfiles.com/bitsavers/pdf/mit/ai/aim/AIM1522.pdf} }
 M. J. Beal, Z. Ghahramani, and C. E. Rasmussen: The infinite hidden Markov model. In Advances in Neural Information Processing Systems 14, MIT Press, 2002. [Bibtex]
@InProceedings{ bealetal:02, author = "Matthew J. Beal and Zoubin Ghahramani and Carl Edward Rasmussen", title = "The Infinite Hidden Markov Model", booktitle = "Advances in Neural Information Processing Systems 14", editor = "T. Dietterich and S. Becker and Z. Ghahramani", publisher = "MIT Press", year = "2002", pdf = {http://books.nips.cc/papers/files/nips14/AA01.pdf} }
 H. Akaike: A new look at the statistical model identification. IEEE Transactions on Automatic Control, AC–19, 716–723, 1974. For a reprint see E. Parzen et al. (Eds.), Selected Papers of Hirotugu Akaike, Springer Series in Statistics, 1998. [Bibtex]
@Article{ akaike:74, author = "H. Akaike", title = "A new look at the statistical model identification", journal = "IEEE Transactions on Automatic Control", volume = "AC-19", pages = "716--723", year = "1974", note = "For a reprint see E. Parzen et al. (Eds.), \emph{Selected Papers of Hirotugu Akaike}, Springer Series in Statistics, 1998", pdf = {http://www.unt.edu/rss/class/Jon/MiscDocs/Akaike_1974.pdf} }