TODO.org


Revise the Introduction Chapter

Write the Sentiment Corpus Chapter

Write Related Work

Write Summary and Conclusion

Revise the Chapter

Send the Chapter to the Supervisor

Incorporate Supervisor’s Feedback

Write the Sentiment Lexicon Chapter

Make Scaffold

Write Introduction

Describe German Lexicon Baseline

Fill Evaluation Tables

Describe Dictionary-Based Methods

Describe Esuli and Sebastiani

Describe Rao:08

Describe Kim:06

Describe Blair-Goldensohn:08

Describe Mohammad:09

Describe Hassan:10

Describe Dragut:10

Describe GermaNet

IN-PROGRESS Fill Evaluation Tables

Re-implement and evaluate Hu:04

IN-PROGRESS Re-implement and evaluate Blair-Goldensohn:08

IN-PROGRESS Re-implement and evaluate Kim:04,Kim:06

Re-implement and evaluate Esuli:06c

IN-PROGRESS Re-implement and evaluate Rao:09 Min-Cut

IN-PROGRESS Re-implement and evaluate Rao:09 Label Propagation

IN-PROGRESS Re-implement and evaluate Awdallah:10

Describe Corpus-Based Methods

Implement the method of Velikovich et al.

Test and evaluate the method of Velikovich et al.

Implement the method of Kiritchenko et al.

Implement the method of Severyn et al.

Describe NWE Methods

Estimate Vo

Estimate Tang

IN-PROGRESS Estimate PCA

Describe Linear Projection

IN-PROGRESS Finish the section

Describe Evaluation

Estimate the Effect of Seed Sets

Estimate the Effect of Word Embeddings

Estimate the Effect of Vector Normalization

Provide Examples of NWE-Based Methods

Finish the section

Write Summary and Conclusions

Revise the Chapter

Incorporate Supervisor’s Feedback

p. 44: I don't know whether we have already talked about this, but equating "polar words" with "emotional expressions" does not quite seem to be standard terminology to me. Emotion analysis also knows dimensions such as agitated/subdued and others, which are interesting but definitely orthogonal to polarity/valence. In other words, I would regard "polarity" as one of the dimensions of "emotion", but not as the same thing.

p. 44 and following: "updated version of dataset" sounds potentially confusing ;-) The dissertation should not retell the multi-step genesis of the dataset but ideally describe only the end result. Anything else confuses the reader (and we already know that we need to aim for brevity anyway). Above all, it is also not good if extensions of the dataset are introduced in later chapters rather than in the dataset chapter.

p. 46: SentiWS: since you said earlier that all lexicons were created through translation, it would be important to learn here how many of the entries were translated from the General Inquirer and how many were then added through collocation analyses (since those were not translated).

p. 46: ZPL: so it was not created through translation at all, in contrast to the preceding statement?

Overall, perhaps give an argument here for why there is no circularity: after all, you first modified/improved your data by comparing them against the lexicons, and afterwards you evaluate the lexicons on those very data.

p. 47: It is not clear to me why the lexicons have such high recall for "neutral". Given how they were created, they should really contain only polar words, shouldn't they?

p. 47, middle: "recomputed on the whole corpus" - as opposed to which previously used subcorpus?

p. 48, beginning of 3.3.3 (this point again): this description of the lexicons' genesis does not harmonize with the ones on pp. 45/46 - definitely not for ZPL, and only partially for SentiWS. It is a bit confusing that the first paragraph introduces translation as the central method, while the following paragraphs "only" mention dictionary- and corpus-based lexicon generation (which has nothing to do with translation, as one only notices on a second reading).

p. 49, second paragraph: the method is not entirely clear to me. Which synsets enter the adjacency matrix?
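To make the question concrete, here is a minimal sketch of the kind of graph propagation at issue, in the spirit of Blair-Goldensohn et al. (2008); the toy graph, seed polarities, and damping factor are all made up for illustration and are not the thesis' actual setup:

#+BEGIN_SRC python
import numpy as np

# Toy vocabulary standing in for synsets; indices into the score vector.
words = ["good", "great", "bad", "awful", "item"]

# Signed adjacency matrix: +1 for synonym edges, -1 for antonym edges.
A = np.zeros((5, 5))
for i, j in [(0, 1), (2, 3)]:        # good~great, bad~awful (synonyms)
    A[i, j] = A[j, i] = 1.0
for i, j in [(0, 2)]:                # good vs. bad (antonyms)
    A[i, j] = A[j, i] = -1.0

s0 = np.array([1.0, 0.0, -1.0, 0.0, 0.0])  # seed polarities (assumed)
s, lam = s0.copy(), 0.2                    # lam: assumed damping factor
for _ in range(10):                        # fixed number of propagation steps
    s = s0 + lam * A @ s

print(dict(zip(words, np.round(s, 3))))
#+END_SRC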

Third paragraph: "following" is not meant temporally here, since Kim & Hovy 2004 does not follow Blair-Goldensohn 2008.

p. 50, top: seriously? Positive and neutral are simply merged? At the beginning of the description of Kim & Hovy they are still kept apart. So are they merged only for one particular substep?

p. 52: Something gets mixed up here between Table 3.10 and the paragraph below Figure 3.6. In Table 3.10 you probably mean "Holonym Rels" rather than "Hyper. Rels"? (Which, by the way, I would rather call "meronym rels", since meronymy is the more common "direction".)

OK, the end of p. 52 is still under construction anyway.

p. 54: It is awkward that related work comes up here yet again - we already had plenty of it in this chapter. I stopped reading at this point.

Write the Fine-Grained Sentiment Analysis Chapter

Write Introduction

Describe Rules for Determining Text Spans

Describe Evaluation Metrics

IN-PROGRESS Describe Conditional Random Fields

Describe Recurrent Neural Networks

Describe Evaluation

Describe Effect of the Annotation Scheme

Describe Effect of Topology

IN-PROGRESS Implement Tree-Structured Models

Describe Effect of Features

Describe Effect of Word Embeddings

Implement ts-w2v-lst-sq

Describe Effect of Lexicons and Normalization

Revise Evaluation

Describe Related Work

Revise Related Work

Write Summary and Conclusions

Revise Chapter

Send Chapter to the Supervisor

Incorporate Supervisor’s Feedback

Error Analysis CRF

Error Analysis LSTM

Error Analysis GRU

Successive Prediction

Write the Coarse-Grained Sentiment Analysis Chapter

Implement Evaluation Script

Describe Evaluation Metrics

Describe Data Preparation

Add Lexicons

GPC

SWS

ZPL

Hu-Liu (Esuli-Sebastiani seed set)

Blair-Goldensohn (Kim-Hovy seed set)

Kim-Hovy (Turney-Littman seed set)

Esuli-Sebastiani (Esuli-Sebastiani seed set)

RR (mincut) (Remus seed set)

RR (label propagation) (Kim-Hovy seed set)

Awdallah-Radev (Kim-Hovy seed set)

Takamura (Hu-Liu seed set)

Velikovich (Kim-Hovy seed set)

Kiritchenko (Kim-Hovy seed set)

Severyn (Kim-Hovy seed set)

Tang (Kim-Hovy seed set)

Vo (Kim-Hovy seed set)

Nearest Centroids (Kim-Hovy seed set)

k-NN (Kim-Hovy seed set)

PCA (Kim-Hovy seed set)

LP (Kim-Hovy seed set)

Normalize Lexicon Scores (see the sketch after this list)

GPC

SWS

ZPL

Hu-Liu (Esuli-Sebastiani seed set)

Blair-Goldensohn (Kim-Hovy seed set)

Kim-Hovy (Turney-Littman seed set)

Esuli-Sebastiani (Esuli-Sebastiani seed set)

RR (mincut) (Remus seed set)

RR (label propagation) (Kim-Hovy seed set)

Awdallah-Radev (Kim-Hovy seed set)

Takamura (Hu-Liu seed set)

Velikovich (Kim-Hovy seed set)

Kiritchenko (Kim-Hovy seed set)

Severyn (Kim-Hovy seed set)

Tang (Kim-Hovy seed set)

Vo (Kim-Hovy seed set)

Nearest Centroids (Kim-Hovy seed set)

k-NN (Kim-Hovy seed set)

PCA (Kim-Hovy seed set)

LP (Kim-Hovy seed set)
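A minimal sketch of one plausible reading of the normalization step above (rescaling each lexicon's raw scores to a common [-1, 1] range by the maximum absolute value); the actual scheme used in the thesis may differ:

#+BEGIN_SRC python
def normalize(scores):
    # Rescale raw lexicon scores to [-1, 1] by the maximum absolute value,
    # so that differently scaled lexicons become comparable.
    m = max(abs(s) for s in scores.values()) or 1.0
    return {w: s / m for w, s in scores.items()}

# Hypothetical entries with heterogeneous score ranges.
print(normalize({"gut": 0.7, "schlecht": -3.5, "okay": 0.1}))
#+END_SRC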

Add PoS-Tags to the Lexicons

Describe Lexicon-Based Methods

Describe Hu-Liu (2004)

Describe Taboada et al. (2011)

Describe Musto et al. (2014)

Describe Jurek et al. (2015)

Describe Kolchyna et al. (2015)

Reimplement Lexicon-Based Methods

Reimplement Hu-Liu (2004)

Reimplement Taboada et al. (2011)

Reimplement Musto et al. (2014)

Reimplement Jurek et al. (2015)

Reimplement Kolchyna et al. (2015)

Evaluate Lexicon-Based Methods

Evaluate LB Approaches on Normalized PotTS Data

Evaluate LB Approaches on Unnormalized PotTS Data

Evaluate LB Approaches on Normalized SB10k Data

Evaluate LB Approaches on Unnormalized SB10k Data

Evaluate Different Lexicon Steps

Describe Different Lexicon Steps

Describe Evaluation of Lexicon-Based Methods

Describe ML-Based Methods

Reimplement ML-Based Methods

Reimplement Gamon, 2004

Reimplement Mohammad, 2013

Reimplement Guenther, 2014

Describe Evaluation of ML-Based Methods

Perform and Describe Feature Ablation Test

Evaluate Different Classifiers

Describe Error Analysis

Revise ML-Based Methods

Describe DL-Based Methods

Describe Choi and Cardie (2008)

Describe Moilanen and Pulman (2007)

Describe Nakagawa (2010)

Describe Yessenalina and Cardie (2010)

Describe Socher et al. (2012)

Describe Socher et al. (2013)

Describe Wang (2015)

Describe Baziotis:17

Describe Cliche:17

Describe Rouvier:17

Revise the Descriptions

Reimplement DL-Based Methods

Reimplement Yessenalina and Cardie (2010)

Reimplement Socher et al. (2011)

Reimplement Socher et al. (2012)

Reimplement Socher et al. (2013)

Reimplement Severyn et al. (2015)

Reimplement Baziotis et al. (2017)

Implement own DL-Based Method

Evaluate DL-Based Methods

IN-PROGRESS Evaluate the Effect of Different Embedding Types

Perform an Error Analysis

Describe Evaluation of DL-Based Methods

Describe Effect of Embeddings

IN-PROGRESS Perform Error Analysis

Perform General Evaluation

Describe Effect of Distant Supervision

Describe Effect of the Lexicons

Describe Effect of Text Normalization

Write Summary and Conclusions

Incorporate Supervisor’s Feedback

p. 102: Here I would announce in a little more detail that you are going to re-implement selected (because successful) approaches, and that in the DL chapter you will then also propose an approach of your own (or rather a modification of Baziotis).

Sct 6.1

p. 103: Why is macro-F1 over pos and neg an alternative to micro-averaging over pos/neg/neut? Shouldn't one always compare over the same set of classes?
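A quick worked example of the two averaging schemes on made-up labels, showing that they are indeed computed over different class sets:

#+BEGIN_SRC python
from sklearn.metrics import f1_score

# Hypothetical gold and predicted labels.
gold = ["pos", "pos", "neg", "neut", "neut", "neg", "pos", "neut"]
pred = ["pos", "neut", "neg", "neut", "pos", "neg", "pos", "neut"]

# Macro-F1 restricted to the two polar classes (neutral ignored):
macro_pn = f1_score(gold, pred, labels=["pos", "neg"], average="macro")

# Micro-average over all three classes:
micro_all = f1_score(gold, pred, average="micro")

print(f"macro-F1(pos, neg) = {macro_pn:.3f}, micro-F1(all) = {micro_all:.3f}")
#+END_SRC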

Sct 6.2

p. 103: Instead of the TreeTagger, wouldn't it be better to use Ines Rehbein's tweet tagger?

p. 103: The reference for the PotTS corpus should not be a paper but the corresponding chapter of this dissertation.

p. 103: What is the purpose of the very simple polarity determination?

p. 104: An explanation of why Example 6.2.2 is annotated as positive?

p. 105: Why does PotTS have an IAA? This has not been mentioned in the text so far.

p. 105: "As you might remember" - it is unusual to address the reader directly (this also occurs later).

Sct 6.3

p. 109: Test the earlier work on PotTS and SB10k, but why not on the 3rd corpus? (Or in other words, why was the 3rd corpus introduced?)

p. 109: "drawback of this resource, which unfortunately slipped through our previous intrinsic evaluation" - what does that mean?

Sct 6.3.1, 6.3.2: interesting!

p. 112: Ex. 6.3.1 might point to a problem of re-implementing the rules of an English system for German, where word order is much less restricted.

p. 112: Ex. 6.3.2: Isn't the question the reason for nullifying the score? (And rightly so, I believe.)

p. 113: What is an "informative part of speech"?

p. 114: Ex. 6.3.5: Is "Ok" a good translation of "Normal"?

p. 115: Why does the same tweet occur twice in the cluster?

Sct 6.4

p. 118ff.: If (many of) these approaches use sentiment lexicons in their feature space, is the dividing line between lexicon-based methods and ML methods really so clear?

p. 120: First mention of "the Linear Projection lexicon" in this chapter. Please remind the reader what it is (section reference).

p. 121: As indicated earlier, at least one German Twitter PoS tagger does exist now. A comparison to the TreeTagger would be really interesting here.

p. 126: Some equations have a number, some do not.

p. 126: In the Yessenalina/Cardie 2011 approach, what are the vectors u and v?

p. 127, line 1: Is "child" = "dependent" and "vector" = "embedding"? If so, better use identical terms.

p. 130: The transition from reviewing earlier work to proposing an approach of your own should be marked more clearly - perhaps by separate subsubsections.

p. 138: Section 6.5.5 is of course a nice pointer to a possible added value of local coherence relations ;-)

Revise the Chapter

Write the Discourse-Level Sentiment Analysis Chapter

Write Introduction

Write Related Work

Add summary of Riloff et al. (2003)

Add summary of Riloff et al. (2003a)

Add summary of Pang et al. (2002)

Add summary of Pang et al. (2004)

Check Section 3.6 of Hu and Liu (2004)

Add summary of Snyder and Barzilay (2007)

Add summary of Asher (2008)

Add summary of Heerschop (2011)

Add summary of Zhou (2011)

Add summary of Zirn (2011)

Add summary of Chenlo (2013)

Prepare Data

Retrain Ji’s Parser on PCC

Add Discourse Parses to DASA

Revise Related Work

Reimplement and evaluate common DASA approaches

Reimplement and Evaluate Last EDU Classifier

Reimplement and Evaluate Root Classifier

Reimplement and Evaluate Discourse-Unaware Classifier

Reimplement and Evaluate DDR Classifier

Reimplement and Evaluate R2N2 Classifier

Reimplement and Evaluate Wang Classifiers

Devise own DASA Method

Evaluate Softmax

Evaluate Custom Simplex Normalization

Evaluate Sparsemax

Evaluate Best Strategy on the Dependency Tree Representation (see the softmax/sparsemax sketch below)

Perform Error Analysis
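For reference while comparing the strategies above, a minimal sketch of softmax vs. sparsemax (Martins and Astudillo, 2016) applied to hypothetical EDU relevance scores; the scores and their interpretation are assumptions, not the thesis' actual model:

#+BEGIN_SRC python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sparsemax(z):
    # Euclidean projection onto the probability simplex
    # (Martins and Astudillo, 2016): weights can become exactly zero.
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, len(z) + 1)
    cssv = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cssv
    k_z = k[support][-1]
    tau = (cssv[support][-1] - 1.0) / k_z
    return np.maximum(z - tau, 0.0)

scores = np.array([2.0, 1.0, -1.0])  # hypothetical EDU relevance scores
print(softmax(scores))    # dense: every EDU gets some weight
print(sparsemax(scores))  # sparse: low-scoring EDUs get exactly 0
#+END_SRC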

Perform and Describe Evaluation

Describe the Effect of Base Classifiers

Analyze the Effect of Discourse Relation Sets

Write Summary and Conclusions

Revise the Chapter

Incorporate Supervisor’s Feedback

Submit the Dissertation to the Deanery

Prepare Documents

Ph.D. Application

Declaration of the subject in which I’m pursuing the degree

Declaration that I’m not pursuing a degree at any other university

Declaration that the work has been completed without external help and according to the best scientific standards

CV

Summary

Diploma

Dissertation

Publication List

Suggestion for Committee

Suggestion for Reviewers

Criminal Record Certificate

Print the Documents

Ph.D. Application

Declaration of the subject in which I’m pursuing the degree

Declaration that I’m not pursuing a degree at any other university

Declaration that the work has been completed without external help and according to the best scientific standards

CV

Summary

Diploma

Dissertation

Publication List

Suggestion for Committee

Suggestion for Reviewers

Criminal Record Certificate

Bring the Documents to the Deanery

Write the Theses

Write the Theses Paper

Send the Paper to the Deanery

Prepare the Presentation

Prepare the Presentation

Rehearse the Presentation

IN-PROGRESS Incorporate Final Corrections

Eisenstein

Chapter 2

For replicability, it would be good to include the complete keyword lists alongside the annotator instructions in an appendix.

I want to poke a little at the definition of targets as “entities or events evaluated by opinions.”

I wonder whether the initial low levels of agreement stemmed from a lack of clarity in the original instructions.

I didn’t understand the correlation analysis in table 2.6.

Chapter 3

Local maximum

Chapter 4

It might help to remind readers of the size of the training and test sets, and to indicate how many features from the training set are unseen in the test set, and vice versa.
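A tiny sketch of the kind of count being asked for, using hypothetical feature inventories:

#+BEGIN_SRC python
# Hypothetical feature sets extracted from each split.
train_feats = {"w=gut", "w=schlecht", "pos=ADJ", "w=super"}
test_feats = {"w=gut", "w=toll", "pos=ADJ"}

unseen_in_train = test_feats - train_feats  # test features never seen in training
unused_in_test = train_feats - test_feats   # training features absent from the test set
print(len(unseen_in_train), len(unused_in_test))
#+END_SRC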

I would also like to see how F1 evolved across the space of regularization parameters, and to know how the final regularization parameter was selected.

I would have liked to know more about how inference and learning were implemented in these structures, since the “off-the-shelf” Viterbi and forward-backward algorithms are not immediately applicable to Semi-Markov and Tree-structured models.
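As one illustration of what has to change, a minimal sketch of segment-level (semi-Markov) Viterbi, assuming segment scores come from a user-supplied callback; transition scores between segments are omitted for brevity, and the toy scorer is made up:

#+BEGIN_SRC python
import numpy as np

def semimarkov_viterbi(n, labels, max_len, seg_score):
    # seg_score(i, j, y): score of labeling tokens i..j-1 as one segment y.
    # V[t]: best score of any segmentation of the first t tokens.
    V = np.full(n + 1, -np.inf)
    V[0], back = 0.0, [None] * (n + 1)
    for t in range(1, n + 1):
        for l in range(1, min(max_len, t) + 1):
            for y in labels:
                s = V[t - l] + seg_score(t - l, t, y)
                if s > V[t]:
                    V[t], back[t] = s, (t - l, y)
    segs, t = [], n                 # recover segments from backpointers
    while t > 0:
        i, y = back[t]
        segs.append((i, t, y))
        t = i
    return V[n], segs[::-1]

# Toy scorer that favors one long "positive" span over short ones.
score = lambda i, j, y: (j - i) ** 2 * (1.0 if y == "pos" else 0.5)
print(semimarkov_viterbi(4, ["pos", "O"], max_len=3, seg_score=score))
#+END_SRC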

Chapter 5

I am skeptical of the use of emoticons to label tweets, despite the fact that this is done in prior work: there’s good evidence that the “smiley” emoticon is used for many pragmatic purposes aside from indicating sentiment, such as softening face-threatening speech acts.
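For concreteness, a sketch of the crude emoticon heuristic at issue (the patterns are a hypothetical inventory; tweets carrying both signals are dropped, but the pragmatic uses mentioned above would still slip through):

#+BEGIN_SRC python
import re

POS = re.compile(r"[:;]-?[)D]")   # :) ;) :-) :D  (assumed positive markers)
NEG = re.compile(r":-?[(/]")      # :( :-( :/     (assumed negative markers)

def weak_label(tweet):
    pos, neg = bool(POS.search(tweet)), bool(NEG.search(tweet))
    if pos == neg:                # neither or both: no reliable signal
        return None
    return "positive" if pos else "negative"

print(weak_label("Great game :)"))   # positive
print(weak_label("ugh :("))          # negative
print(weak_label("well :) but :("))  # None (conflicting signals)
#+END_SRC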

I would relabel “distant supervision” as “semi-supervised learning” or “weak supervision”, as “distant supervision” typically refers to supervision from type-level resources such as knowledge bases.

Chapter 6

But I couldn’t understand why the No-Discourse method also improved in this setting.

Stede

Chapter 1

In doing this, it goes some way to providing an account of the state of the art, but one thing the reader misses is a concise definition of SA; here it would have been sufficient to quote one from influential literature, such as the book by Liu (2012).

Also, giving a few more examples to illustrate the range of subtasks and possible domains would be helpful.

Chapter 2

Polar words get a two-valued strength attribute; I wonder why the two values are strong and medium, rather than strong and weak.

Here a slightly broader discussion would have been nice, one that looks at IAA treatment in related work on span labeling and perhaps specifically considers the potential utility of Krippendorff’s unitized alpha.

US does not build a single gold standard – a decision that could have been briefly discussed at the end of the chapter.

Chapter 4

To evaluate the work, US proposes to use a token-sensitive measure suggested in related work for other purposes. To appreciate this decision, it would be good to get information on how fine-grained SA approaches for English usually handle this. Likewise, for methods and results a brief overview of related work would be helpful here.

Chapter 5

In the section on machine-learning methods, I would appreciate a sub/section break between the extensive related work part and the author’s own proposal and implementation.

Official

Fix bibliography

Rename distant supervision

Clarify the gold format

Re-read the thesis