<?xml 
version="1.0" encoding="utf-8"?><?xml-stylesheet title="XSL formatting" type="text/xsl" href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=backend.xslt" ?>
<rss version="2.0" 
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:atom="http://www.w3.org/2005/Atom"
>

<channel xml:lang="fr">
	<title>MC2 2018 Lab</title>
	<link>https://clef2018.clef-initiative.eu/mc2/</link>
	<description>MC2 CLEF Lab is centered on mining the social media sphere surrounding cultural events such as festivals and movies. It provides registered participants with access to the microblog collection of the GAFES project, funded by the French National Research Agency and led by the University of Avignon.</description>
	<language>fr</language>
	<generator>SPIP - www.spip.net</generator>
	<atom:link href="https://clef2018.clef-initiative.eu/mc2/spip.php?id_rubrique=3&amp;page=backend" rel="self" type="application/rss+xml" />




<item xml:lang="en">
		<title>TimeLine Illustration based on Microblogs</title>
		<link>https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=14</link>
		<guid isPermaLink="true">https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=14</guid>
		<dc:date>2016-10-19T19:42:26Z</dc:date>
		<dc:format>text/html</dc:format>
		<dc:language>en</dc:language>
		<dc:creator>Lorraine, Philippe</dc:creator>


		<dc:subject>CLEF 2016</dc:subject>

		<description>
&lt;p&gt;This paper by Nayanika DOGRA, Philippe MULHEM, Nawal OULD AMER, and Lorraine GOEURIOT presents the approach used by the LIG-MRIM research group for its participation in the pilot task TimeLine Illustration based on Microblogs of the 2016 CLEF Cultural Microblog Contextualization Workshop, which led to the 2017 lab.&lt;/p&gt;


-
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=rubrique&amp;id_rubrique=7" rel="directory"&gt;3 - Time Line Illustration&lt;/a&gt;

/ 
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=mot&amp;id_mot=3" rel="tag"&gt;CLEF 2016&lt;/a&gt;

		</description>


 <content:encoded>&lt;div class='rss_texte'&gt;&lt;p&gt;This paper by Nayanika DOGRA, Philippe MULHEM, Nawal OULD AMER, and Lorraine GOEURIOT presents the approach used by the LIG-MRIM research group for its participation in the pilot task TimeLine Illustration based on Microblogs of the 2016 CLEF Cultural Microblog Contextualization Workshop, which led to the 2017 lab.&lt;/p&gt;&lt;/div&gt;
		&lt;div class="hyperlien"&gt;View online : &lt;a href="http://ceur-ws.org/Vol-1609/16091201.pdf" class="spip_out"&gt;LIG at CLEF 2016 Cultural Microblog Contextualization: TimeLine Illustration based on Microblogs&lt;/a&gt;&lt;/div&gt;
		
		</content:encoded>


		

	</item>
<item xml:lang="en">
		<title>Wikipedia XML corpus for summary generation</title>
		<link>https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=13</link>
		<guid isPermaLink="true">https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=13</guid>
		<dc:date>2016-10-18T16:44:45Z</dc:date>
		<dc:format>text/html</dc:format>
		<dc:language>en</dc:language>
		<dc:creator>sanjuan</dc:creator>


		<dc:subject>data</dc:subject>
		<dc:subject>CLEF 2016</dc:subject>

		<description>
&lt;p&gt;Wikipedia is under a Creative Commons license, and its contents can be used to contextualize tweets or to build complex queries referring to Wikipedia entities. &lt;br class='autobr' /&gt;
We have extracted an average of 10 million XML documents from Wikipedia per year since 2012 in the four main Twitter languages: en, es, fr and pt. &lt;br class='autobr' /&gt;
These documents reproduce in an easy-to-use XML structure the contents of the main Wikipedia pages: title, abstract, section and subsections as well as Wikipedia internal links. Other (&#8230;)&lt;/p&gt;


-
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=rubrique&amp;id_rubrique=5" rel="directory"&gt;1 - Content Analysis&lt;/a&gt;

/ 
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=mot&amp;id_mot=2" rel="tag"&gt;data&lt;/a&gt;, 
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=mot&amp;id_mot=3" rel="tag"&gt;CLEF 2016&lt;/a&gt;

		</description>


 <content:encoded>&lt;div class='rss_texte'&gt;&lt;p&gt;Wikipedia is under a Creative Commons license, and its contents can be used to contextualize tweets or to build complex queries referring to Wikipedia entities.&lt;/p&gt;
&lt;p&gt;We have extracted an average of 10 million XML documents from Wikipedia per year since 2012 in the four main Twitter languages: en, es, fr and pt.&lt;/p&gt;
&lt;p&gt;These documents reproduce in an easy-to-use XML structure the contents of the main Wikipedia pages: title, abstract, section and subsections as well as Wikipedia internal links. Other contents such as images, footnotes and external links are stripped out in order to obtain a corpus that is easier to process with standard NLP tools.&lt;/p&gt;
&lt;p&gt;By comparing contents over the years, it is possible to detect long-term trends.&lt;/p&gt;&lt;/div&gt;
		&lt;div class="hyperlien"&gt;View online : &lt;a href="http://tc.talne.eu/" class="spip_out"&gt;Micro Blog Contextualization CLEF &amp; Inex tracks data and tools&lt;/a&gt;&lt;/div&gt;
		
		</content:encoded>


		

	</item>
<item xml:lang="en">
		<title>Microblog Cultural Contextualization 2017 lab introduction</title>
		<link>https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=11</link>
		<guid isPermaLink="true">https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=11</guid>
		<dc:date>2016-10-18T12:38:54Z</dc:date>
		<dc:format>text/html</dc:format>
		<dc:language>en</dc:language>
		<dc:creator>sanjuan</dc:creator>



		<description>
&lt;p&gt;These are the slides presented at CLEF 2016 in Evora to introduce the CM2 lab. Overall Procedure: Take a microblog about an event with a URL. Identify its language. Identify a related cultural event or filter it out. Reveal When, Where, Who ... Relate it to Wikipedia entities. 2017 Organization: Task 1: language, filtering and localization, led by Toulouse, Montr&#233;al and Paris, starts &#8230; now! Task 2: entity extraction, summarization and linking, starts in November 2016, led by Avignon, (&#8230;)&lt;/p&gt;


-
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=rubrique&amp;id_rubrique=3" rel="directory"&gt;Tasks 2017&lt;/a&gt;


		</description>


 <content:encoded>&lt;div class='rss_texte'&gt;&lt;p&gt;These are the slides presented at CLEF 2016 in Evora to introduce the CM2 lab.&lt;/p&gt;
&lt;h2 class=&#034;spip&#034;&gt;Overall Procedure&lt;/h2&gt;&lt;ol class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Take a microblog about an event with a URL.&lt;/li&gt;&lt;li&gt; Identify its language.&lt;/li&gt;&lt;li&gt; Identify a related cultural event or filter it out.&lt;/li&gt;&lt;li&gt; Reveal When, Where, Who ...&lt;/li&gt;&lt;li&gt; Relate it to Wikipedia entities.&lt;/li&gt;&lt;/ol&gt;&lt;h2 class=&#034;spip&#034;&gt;2017 Organization&lt;/h2&gt;&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Task 1: language, filtering and localization, led by Toulouse, Montr&#233;al and Paris, starts &#8230; now!&lt;/li&gt;&lt;li&gt; Task 2: entity extraction, summarization and linking, led by Avignon, London University and Syllabs, starts in November 2016.&lt;/li&gt;&lt;li&gt; Task 3: time-line illustration, led by Grenoble, starts in January 2017.&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;More inside the slides ...&lt;/p&gt;&lt;/div&gt;
		&lt;div class="hyperlien"&gt;View online : &lt;a href="https://docs.google.com/presentation/d/1d09TE5Za5AizOAOQE71WaCyTPkb81rgTUis8mlTPbQg/edit?usp=sharing" class="spip_out"&gt;Evora's CM2 lab presentation slides&lt;/a&gt;&lt;/div&gt;
		
		</content:encoded>


		

	</item>
<item xml:lang="en">
		<title>Cultural Microblog Contextualization based on Wikipedia</title>
		<link>https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=8</link>
		<guid isPermaLink="true">https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=8</guid>
		<dc:date>2016-03-31T21:40:51Z</dc:date>
		<dc:format>text/html</dc:format>
		<dc:language>en</dc:language>
		<dc:creator>Jian-Yun Nie, josiane, Liana Ermakova</dc:creator>



		<description>
&lt;p&gt;Organizers: &lt;br class='autobr' /&gt;
Liana Ermakova, Josiane Mothe, Jian-Yun Nie (cmct1@irit.fr) &lt;br class='autobr' /&gt;
Task 1 participation deadline extended to 23 May, 2016 &lt;br class='autobr' /&gt;
Objective &lt;br class='autobr' /&gt;
The aim of this task is to generate a short summary providing background information for a tweet to help a user understand it. For instance, if a microblog announced a cultural event, participants would have to provide a short summary extracted from Wikipedia that provides extensive background about this event. The summary must contain (&#8230;)&lt;/p&gt;


-
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=rubrique&amp;id_rubrique=5" rel="directory"&gt;1 - Content Analysis&lt;/a&gt;


		</description>


 <content:encoded>&lt;div class='rss_texte'&gt;&lt;p&gt;&lt;strong&gt;Organizers: &lt;br class='autobr' /&gt;
&lt;/strong&gt;&lt;br class='autobr' /&gt;
&lt;i&gt;Liana Ermakova, Josiane Mothe, Jian-Yun Nie (cmct1@irit.fr)&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;Task 1 participation deadline extended to &lt;strong&gt;23 May, 2016&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Objective&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The aim of this task is to generate a short summary providing background information for a tweet to help a user understand it. For instance, if a microblog announced a cultural event, participants would have to provide a short summary extracted from Wikipedia that provides extensive background about this event. The summary must contain information about the context of the event in order to help answer questions like &#034;what is this tweet about?&#034; using a recent cleaned dump of Wikipedia. The context should be in the form of a readable summary, not exceeding 500 words, composed of passages from the provided Wikipedia corpus.&lt;/p&gt;
&lt;p&gt;Any open access resource can be used in addition to the data we provide to participants, subject to describing it and providing a valid URL.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data&lt;/strong&gt;&lt;/p&gt;
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Tweets to contextualize: We select a set of 1001 tweets to be contextualized by the participants using the English version of Wikipedia. These tweets in English are collected from a set of public micro-blogs on Twitter and are related to the keyword &#8220;festival&#8221;. The microblogs are in UTF8 csv format with various fields. In this task, the tweets do not contain URLs. The other tasks will use additional information.&lt;/li&gt;&lt;/ul&gt;&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Wikipedia Crawl: Unlike tweets, Wikipedia is under a Creative Commons license, and its content can be used to contextualize tweets or to build complex queries referring to Wikipedia entities. We have extracted from Wikipedia an average of 10 million XML documents per year since 2012 in the four main Twitter languages: en, es, fr and pt. These documents reproduce in an easy-to-use XML structure the contents of the main Wikipedia pages: title, abstract, section and subsections as well as Wikipedia internal links.&lt;br class='autobr' /&gt;
Other contents such as images, footnotes and external links are stripped out in order to obtain a corpus that is easy to process with standard NLP tools. By comparing contents over the years, it is possible to detect long-term trends.&lt;/li&gt;&lt;/ul&gt;&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Format of the results&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;Results should be provided in CSV format:&lt;/p&gt;
&lt;div class=&#034;precode&#034;&gt;&lt;pre class='spip_code spip_code_block' dir='ltr' style='text-align:left;'&gt;&lt;code&gt;&amp;lt;tid&amp;gt; Q0 &amp;lt;file&amp;gt; &amp;lt;rank&amp;gt; &amp;lt;rsv&amp;gt; &amp;lt;run_id&amp;gt; &amp;lt;text of passage 1&amp;gt;
&amp;lt;tid&amp;gt; Q0 &amp;lt;file&amp;gt; &amp;lt;rank&amp;gt; &amp;lt;rsv&amp;gt; &amp;lt;run_id&amp;gt; &amp;lt;text of passage 2&amp;gt;
&amp;lt;tid&amp;gt; Q0 &amp;lt;file&amp;gt; &amp;lt;rank&amp;gt; &amp;lt;rsv&amp;gt; &amp;lt;run_id&amp;gt; &amp;lt;text of passage 3&amp;gt;
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;p&gt;where:&lt;/p&gt;
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; The first column is the tweet id (id field of the JSON format).&lt;/li&gt;&lt;li&gt; The second column is currently unused and should always be Q0.&lt;/li&gt;&lt;li&gt; The third column is the file name (without .xml) from which a result is retrieved; it is identical to the one in the Wikipedia document. Alternatively, the Wikipedia page title can also be used.&lt;/li&gt;&lt;li&gt; The fourth column is the position number of the passage in the summary, independent of its informativeness.&lt;/li&gt;&lt;li&gt; The fifth column shows the score (integer or floating point) that should reflect the estimated informativeness of the passage. This score is used in the pooling process to build informativeness q-rels.&lt;/li&gt;&lt;li&gt; The sixth column is called the &#034;run tag&#034; and should be a unique identifier for your group AND for the method used.&lt;/li&gt;&lt;li&gt; The seventh column is the raw text of the Wikipedia passage. Text is given without XML tags and without formatting characters (avoid &#034;\n&#034;,&#034;\r&#034;,&#034;\l&#034;). The resulting word sequence has to appear in the file indicated in the third field.&lt;/li&gt;&lt;li&gt; The columns are separated by tabs.&lt;/li&gt;&lt;/ul&gt;
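&lt;p&gt;As an illustration only, a result line with the seven columns listed above could be assembled as in the following Python sketch. The function name format_run_line and the run tag myteam_run1 are hypothetical and not part of the task definition; the tweet id, file name and score are borrowed from the example given for this task.&lt;/p&gt;
&lt;div class=&#034;precode&#034;&gt;&lt;pre class='spip_code spip_code_block' dir='ltr' style='text-align:left;'&gt;&lt;code&gt;
```python
def format_run_line(tid, file_name, rank, rsv, run_id, passage):
    """Build one tab-separated result line with the seven columns described above."""
    # Collapse newlines, carriage returns, tabs and repeated spaces,
    # so that the passage stays on a single line and cannot break the columns.
    clean_text = " ".join(passage.split())
    # The second column is unused and is always the literal string Q0.
    fields = [str(tid), "Q0", str(file_name), str(rank), str(rsv), run_id, clean_text]
    return "\t".join(fields)


line = format_run_line(
    tid=610507526174601216,
    file_name=1693938,
    rank=0,
    rsv=14.0,
    run_id="myteam_run1",  # hypothetical run tag, one per group and method
    passage="The Marionnettissimo festival is part of a series\nof cultural actions.",
)
print(line)
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Splitting the passage on whitespace and joining it with single spaces is only one possible way to remove formatting characters; the cleaned word sequence still has to appear in the file named in the third column.&lt;/p&gt;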
&lt;p&gt;&lt;strong&gt;Example:&lt;br class='autobr' /&gt;
&lt;/strong&gt;&lt;/p&gt;
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Topic 610507526174601216:
&lt;div class=&#034;precode&#034;&gt;&lt;pre class='spip_code spip_code_block' dir='ltr' style='text-align:left;'&gt;&lt;code&gt;Classes &#034;The scenic writings to the manipulated object.&#034; Francis and Peter were very promising in the art of manipulation. Some pictures of the live performances at Usine Tournefeuille.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Possible abstract:
&lt;div class=&#034;precode&#034;&gt;&lt;pre class='spip_code spip_code_block' dir='ltr' style='text-align:left;'&gt;&lt;code&gt;Marionnettissimo is a puppet festival, created by the association Et Qui Libre / Marionnettissimo (or EQL / Marionnettissimo), whose objective is the development of &#034;puppet culture&#034;, considering the public, artists, and cultural actors. The Marionnettissimo festival is part of a series of cultural actions, programming, training, conducted by the association since 1990. It takes place in the Toulouse area and the Midi-Pyrenees region, annually since 2006. The &#8220;scenic writings to the manipulated object&#8221; training was presented by Francis Monty from the La Pire Esp&#232;ce group (Quebec) and Pier Porcheron from the Elvis Alatac troupe (Poitou-Charentes) at Marionnettissimo festival from the 8th to the 19th of february 2016.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Formatted result:
&lt;div class=&#034;precode&#034;&gt;&lt;pre class='spip_code spip_code_block' dir='ltr' style='text-align:left;'&gt;&lt;code&gt;610507526174601216 Q0 1693938 0 14.0	Marionnettissimo is a puppet festival, created by the association Et Qui Libre / Marionnettissimo (or EQL / Marionnettissimo), whose objective is the development of &#034;puppet culture&#034;, considering the public, artists, and cultural actors.
610507526174601216 Q0 1693938 1 12.0	The Marionnettissimo festival is part of a series of cultural actions, programming, training, conducted by the association since 1990.
610507526174601216 Q0 1693938 2 11.0	The &#8220;scenic writings to the manipulated object&#8221; training was presented by Francis Monty from the La Pire Esp&#232;ce group (Quebec) and Pier Porcheron from the Elvis Alatac troupe(Poitou-Charentes) at Marionnettissimo festival from the 8th to the 19th of february 2016.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Evaluation&lt;br class='autobr' /&gt;
&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The summaries will be evaluated according to:&lt;/p&gt;
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Informativeness: the way they overlap with relevant passages (number of them, vocabulary and bi-grams included or missing). For each tweet, all passages from all participants will be merged and displayed to the assessor in alphabetical order. Therefore, each passage's informativeness will be evaluated independently from others, even in the same summary. Assessors will have to provide a binary judgment on whether or not the passage should appear in a summary on the topic.&lt;/li&gt;&lt;/ul&gt;&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Readability: assessed by evaluators and participants. Each participant will have to evaluate readability for a pool of summaries on an online web interface. Each summary consists of a set of passages and for each passage, assessors will have to tick four kinds of check boxes:&lt;/li&gt;&lt;/ul&gt;&lt;ol class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Syntax (S): tick the box if the passage contains a syntactic problem (bad segmentation for example),&lt;/li&gt;&lt;li&gt; Anaphora (A): tick the box if the passage contains an unsolved anaphora,&lt;/li&gt;&lt;li&gt; Redundancy (R): tick the box if the passage contains redundant information, i.e. information that has already been given in a previous passage,&lt;/li&gt;&lt;li&gt; Trash (T): tick the box if the passage does not make any sense in its context (i.e. after reading the previous passages). These passages must then be considered as trashed, and the readability of following passages must be assessed as if these passages were not present.&lt;/li&gt;&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Download the data:&lt;br class='autobr' /&gt;
&lt;/strong&gt;&lt;br class='autobr' /&gt;
Tweets to contextualize (&lt;a href=&#034;http://tc.talne.eu/task1_topics_en.xml&#034; class=&#034;spip_out&#034; rel=&#034;external&#034;&gt;download&lt;/a&gt;)&lt;br class='autobr' /&gt;
Wikipedia collection to use to contextualize the tweets (&lt;a href=&#034;http://tc.talne.eu&#034; class=&#034;spip_out&#034; rel=&#034;external&#034;&gt;download&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Submission&lt;br class='autobr' /&gt;
&lt;/strong&gt;&lt;br class='autobr' /&gt;
Participants should be registered at &lt;a href=&#034;http://clef2016-labs-registration.dei.unipd.it/registrationForm.php&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;http://clef2016-labs-registration.dei.unipd.it/registrationForm.php&lt;/a&gt;. The personal access to the submission form is sent after the registration.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2016 Schedule&lt;br class='autobr' /&gt;
&lt;/strong&gt;&lt;/p&gt;
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Topics and task guidelines released: 1 April&lt;/li&gt;&lt;li&gt; Run submission deadline: &lt;strong&gt;23 May (extended)&lt;/strong&gt;&lt;/li&gt;&lt;li&gt; Informativeness Evaluation results sent out: 5 June&lt;/li&gt;&lt;li&gt; Readability Evaluation results sent out: 5 June&lt;/li&gt;&lt;li&gt; Participant papers (CLEF proceedings) due: 7 June&lt;/li&gt;&lt;li&gt; Overview paper due: 30 June&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;
		
		</content:encoded>


		

	</item>
<item xml:lang="en">
		<title>TimeLine illustration of a festival based on Microblogs</title>
		<link>https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=6</link>
		<guid isPermaLink="true">https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=6</guid>
		<dc:date>2015-11-03T13:10:11Z</dc:date>
		<dc:format>text/html</dc:format>
		<dc:language>en</dc:language>
		<dc:creator>Lorraine, Philippe</dc:creator>


		<dc:subject>CLEF 2016</dc:subject>

		<description>
&lt;p&gt;Objective &lt;br class='autobr' /&gt;
The goal of this task is to link the events of a festival program to related microblog posts. This information is very important for attendees of festivals and for organizers to get feedback. &lt;br class='autobr' /&gt;
Microblog posts will be provided with their timestamps, which are crucial as a basis for the requested linking. &lt;br class='autobr' /&gt;
Participants will have to provide a timetable for each event using the 10 best tweets based on their relevance and diversity. In this task, diversity is a must because (&#8230;)&lt;/p&gt;


-
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=rubrique&amp;id_rubrique=7" rel="directory"&gt;3 - Time Line Illustration&lt;/a&gt;

/ 
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=mot&amp;id_mot=3" rel="tag"&gt;CLEF 2016&lt;/a&gt;

		</description>


 <content:encoded>&lt;div class='rss_texte'&gt;&lt;p&gt;&lt;strong&gt;Objective&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The goal of this task is to link the events of a festival program to related microblog posts. This information is very important for attendees of festivals and for organizers to get feedback.&lt;/p&gt;
&lt;p&gt;Microblog posts will be provided with their timestamps, which are crucial as a basis for the requested linking.&lt;/p&gt;
&lt;p&gt;Participants will have to provide a timetable for each event using the 10 best tweets based on their relevance and diversity. In this task, diversity is a must because retrieving the same post several times is not beneficial in our case.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data&lt;/strong&gt;&lt;/p&gt;
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Microblogs collection:&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;We collected all public micro-blog posts from Twitter containing the keyword &#8220;festival&#8221; from June to September 2015 using a private archive service, with Twitter's agreement, based on the streaming API. The average number of unique micro-blog posts (i.e. without retweets) is 2,616,008 per month. The total number of collected posts is 13,167,910 without retweets and 24,228,699 with retweets.&lt;br class='autobr' /&gt;
These posts are provided in UTF8 csv format with various fields (tweet id, author name, language, &#8230;).&lt;br class='autobr' /&gt;
Because of privacy issues, this data cannot be publicly released but can be analyzed inside the organization that purchases these archives and among collaborators under a privacy agreement. The CLEF 2016 CMC workshop will provide the opportunity to share this data among participants. These archives can be indexed and analyzed, and general results acquired from them can be published without restriction.&lt;/p&gt;
&lt;p&gt;Participants for this task will be provided with a subset of the microblogs collection, matching the months of targeted festivals (July and December 2015).&lt;/p&gt;
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Festival programme:&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;Two French music festivals have been selected: the &lt;a href=&#034;http://www.vieillescharrues.asso.fr/&#034; class=&#034;spip_out&#034; rel=&#034;external&#034;&gt;festival des vieilles charrues&lt;/a&gt; and the &lt;a href=&#034;http://lestrans.com/&#034; class=&#034;spip_out&#034; rel=&#034;external&#034;&gt;transmusicales de Rennes&lt;/a&gt;. &lt;br class='autobr' /&gt;
The timelines provided cover a subset of each festival program selected by the organizers (for each stage and time, the list of artists playing).&lt;/p&gt;
&lt;p&gt;The participants are free to use any additional data to provide results: social (popularity, &#8230;) or not (knowledge bases, &#8230;); it should be described in the related paper and specified when submitting the runs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We have selected 3 events from the festival des vieilles charrues. Three example tweets are listed for each event.&lt;/p&gt;
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; 16-juil-15	18:45-19:45	Anna Calvi
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; Anna Calvi Festival les Vieilles Charrues jeudi 16 juillet 2015 par Herve Le Gall via @shotsfr - &lt;a href=&#034;http://t.co/qL5lmRkZCb&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;http://t.co/qL5lmRkZCb&lt;/a&gt;&lt;/li&gt;&lt;li&gt; Du Nouveau sur Taste Of Indie : Anna Calvi &#8211; Festival des Vieilles Charrues 2015 &lt;a href=&#034;http://t.co/oZfS2jOKUJ&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;http://t.co/oZfS2jOKUJ&lt;/a&gt; &lt;a href=&#034;http://t.co/B4DnyGxVll&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;http://t.co/B4DnyGxVll&lt;/a&gt;&lt;/li&gt;&lt;li&gt; Ouest-France Vieilles Charrues. DIRECT - Doux d&#233;buts avec Anna Calvi, Soprano ... Ouest-France Le festival des&#8230; &lt;a href=&#034;http://t.co/G0VfPEnrs8&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;http://t.co/G0VfPEnrs8&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt; 16-juil-15	20:10-21:45	Soprano
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; RT @Sopranopsy4: Extraordinaire merci les vieilles es charrues merci la Bretagne!!!!&lt;/li&gt;&lt;li&gt; RT @Laura_AnneT: #charrues @soprano dingue surtout avec le maillot psg @MaxLaMendz3 t'es un client @GuillermNicola1 #rienafoutrederien&lt;/li&gt;&lt;li&gt; aux vieilles charrues on a tellement bien fait de pas aller voir soprano pour gratter des places pour muse putain&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt; 16-juil-15	22:00-23:30	Muse
&lt;ul class=&#034;spip&#034; role=&#034;list&#034;&gt;&lt;li&gt; MUSE Festival des Vieilles Charrues 2015 - Carhaix - Live HD &lt;a href=&#034;https://t.co/Qzokxb40V4&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;https://t.co/Qzokxb40V4&lt;/a&gt; via @YouTube&lt;/li&gt;&lt;li&gt; RT @Charrues: .@muse retourne litt&#233;ralement le public de Kerampuilh ! #charrues15 Cr&#233;dit photo : @PierreHennequin &lt;a href=&#034;http://t.co/MRoC8aTetr&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;http://t.co/MRoC8aTetr&lt;/a&gt;&lt;/li&gt;&lt;li&gt; Aux Vieilles Charrues il y avait 1,7% de chance que Muse jouent The Groove. Et ils l'ont fait PUTAIN&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Format of the results&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The results will be submitted as standard trec_eval top files. As in classical trec_eval top files, each event will be associated with one query/topic identifier. &lt;br class='autobr' /&gt;
The exact format remains to be specified; submissions will need to give details on the type of run, resources used, system used&#8230;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Evaluation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The evaluation will be carried out on selected parts of the program chosen by the task organizers depending on the number of relevant tweets per event. The evaluation measures planned are recall/precision based. Several types of runs will be proposed: time-only, content-only, time&amp;content.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How to get the data?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To get access to the tweets, email eric_dot_sanjuan_at_univ-avignon.fr&lt;br class='autobr' /&gt;
The topics (corresponding to the programs) can be downloaded &lt;a href=&#034;http://mrim.imag.fr/User/lorraine.goeuriot/data/task3-topics.xml&#034; class=&#034;spip_out&#034; rel=&#034;external&#034;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Participants should submit up to 3 runs in the TREC format, named as follows: &lt;br class='autobr' /&gt;
&lt;TeamName&gt;_Run&lt;RunNumber&gt;.dat&lt;br class='autobr' /&gt;
One of them should be a baseline. Other runs can use any additional information.&lt;/p&gt;
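&lt;p&gt;As a rough sketch only, a run file following the naming pattern above could be prepared as shown below. It assumes the usual six-column trec_eval layout (topic id, Q0, document id, rank, score, run tag); the function names, the identifiers event1, t1 and t2, and the team name MyTeam are hypothetical.&lt;/p&gt;
&lt;div class=&#034;precode&#034;&gt;&lt;pre class='spip_code spip_code_block' dir='ltr' style='text-align:left;'&gt;&lt;code&gt;
```python
def run_file_name(team_name, run_number):
    """Name a run file as TeamName, then _Run, then the run number, then .dat."""
    return f"{team_name}_Run{run_number}.dat"


def trec_lines(event_id, ranked_tweets, run_tag, limit=10):
    """Format the best tweets of one event as trec_eval top-file lines.

    ranked_tweets is a list of (tweet_id, score) pairs, best first.
    A tweet id that was already emitted is skipped, since returning
    the same post several times brings no benefit in this task.
    """
    lines = []
    seen = set()
    for tweet_id, score in ranked_tweets:
        if tweet_id in seen:
            continue
        seen.add(tweet_id)
        rank = len(lines) + 1  # ranks start at 1 here, which is an assumption
        lines.append(f"{event_id} Q0 {tweet_id} {rank} {score} {run_tag}")
        if len(lines) == limit:
            break
    return lines


example = trec_lines("event1", [("t1", 3.5), ("t1", 3.0), ("t2", 2.0)], "MyTeam_Run1")
print(run_file_name("MyTeam", 1))
print("\n".join(example))
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The limit of 10 lines per event mirrors the 10 best tweets requested in the objective; how the scores are computed is left entirely to each participant.&lt;/p&gt;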
&lt;p&gt;A text file should also describe the runs and give the priority order.&lt;/p&gt;
&lt;p&gt;The runs should be submitted by the 31st of May. The submission website is TBD.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Contact Information&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If you have any questions, email us: lorraine_dot_goeuriot_at_imag.fr and philippe_dot_mulhem_at_imag.fr&lt;/p&gt;&lt;/div&gt;
		
		</content:encoded>


		

	</item>
<item xml:lang="en">
		<title>Microblog Data Set</title>
		<link>https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=4</link>
		<guid isPermaLink="true">https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=4</guid>
		<dc:date>2015-11-02T08:08:38Z</dc:date>
		<dc:format>text/html</dc:format>
		<dc:language>en</dc:language>
		<dc:creator>sanjuan</dc:creator>


		<dc:subject>data</dc:subject>

		<description>
&lt;p&gt;The document collection provided by the GAFES project consists of a pool of more than 70M unique microblogs from different sources with their meta-information and expanded URLs on a MySQL server. Due to legal terms, access to this database is restricted to registered participants under a privacy agreement. &lt;br class='autobr' /&gt;
Along with the microblog corpus, a clean, simplified XML dump of Wikipedia that is easy to index and to process with state-of-the-art NLP tools is made available to participants. Ground truth (&#8230;)&lt;/p&gt;


-
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=rubrique&amp;id_rubrique=6" rel="directory"&gt;2 - MicroBlog Search&lt;/a&gt;

/ 
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=mot&amp;id_mot=2" rel="tag"&gt;data&lt;/a&gt;

		</description>


 <content:encoded>&lt;div class='rss_texte'&gt;&lt;p&gt;The document collection provided by the GAFES project consists of a pool of more than 70M unique microblogs from different sources with their meta-information and expanded URLs on a MySQL server. Due to legal terms, access to this database is restricted to registered participants under a privacy agreement.&lt;/p&gt;
&lt;p&gt;Along with the microblog corpus, a clean, simplified XML dump of Wikipedia that is easy to index and to process with state-of-the-art NLP tools is made available to participants. Ground truth material is the following:&lt;/p&gt;&lt;/div&gt;
		
		</content:encoded>


		

	</item>
<item xml:lang="en">
		<title>Evaluation Methodology</title>
		<link>https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=3</link>
		<guid isPermaLink="true">https://clef2018.clef-initiative.eu/mc2/spip.php?page=article&amp;id_article=3</guid>
		<dc:date>2015-11-02T07:40:36Z</dc:date>
		<dc:format>text/html</dc:format>
		<dc:language>en</dc:language>
		<dc:creator>sanjuan</dc:creator>



		<description>
&lt;p&gt;Systems will be evaluated mainly on informativeness and relevance, but readability and ergonomy will also be checked. Informativeness evaluation will rely on textual references established by experts in the GAFES project, following the strict methodology &lt;br class='autobr' /&gt;
of the CLEF-INEX tweet contextualization track (http://inex.mmci.uni-saarland.de/tracks/qa/). Readability and ergonomy evaluation will be carried out on the output for specific festivals based on questionnaires to be filled out by lab participants. Best (&#8230;)&lt;/p&gt;


-
&lt;a href="https://clef2018.clef-initiative.eu/mc2/spip.php?page=rubrique&amp;id_rubrique=3" rel="directory"&gt;Tasks 2017&lt;/a&gt;


		</description>


 <content:encoded>&lt;div class='rss_texte'&gt;&lt;p&gt;Systems will be evaluated mainly on informativeness and relevance, but readability and ergonomy will also be checked. Informativeness evaluation will rely on textual references established by experts in the GAFES project, following the strict methodology &lt;br class='autobr' /&gt;
of the CLEF-INEX tweet contextualization track (&lt;a href=&#034;http://inex.mmci.uni-saarland.de/tracks/qa/&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;http://inex.mmci.uni-saarland.de/tracks/qa/&lt;/a&gt;). Readability and ergonomy evaluation will be carried out on the output for specific festivals based on questionnaires to be filled out by lab participants. The best systems will have the opportunity to be tested in real conditions in July 2016, with the support of the French Tech Culture label (&lt;a href=&#034;http://frenchculture.org/digital-cultures&#034; class=&#034;spip_url spip_out auto&#034; rel=&#034;nofollow external&#034;&gt;http://frenchculture.org/digital-cultures&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Therefore, informativeness and relevance evaluation will be automatic and reproducible, while readability and ergonomy evaluation will only be available to lab participants. All systems will be required to run on a dedicated Linux server (allowing virtual machines) provided by the organizers and will have to run in real time (maximum 5s per query). Access to the full microblog data will only be authorized for applications running on this server.&lt;/p&gt;&lt;/div&gt;
		
		</content:encoded>


		
		<enclosure url="https://clef2018.clef-initiative.eu/mc2/IMG/pdf/mc2_pres-2.pdf" length="164649" type="application/pdf" />
		

	</item>



</channel>

</rss>
