The 163 Chinese News Dataset 2011

---------------------------------------------------
1. DATA DESCRIPTION

We are happy to announce that the "163 Chinese News Dataset 2011" is now available for download. The dataset contains 13703 news articles crawled from news.163.com. Each article consists of a title, a summary and a set of keyphrases. The summaries and keyphrases were manually annotated by 163.com editors. To use this data, please observe the following guidelines:
   (1) For research only.  
   (2) Do not re-distribute.  
   (3) If you decide to use this work in your publication, please cite the following paper:
	  @inproceedings{liu2011conll, 
		  author = "Zhiyuan Liu and Xinxiong Chen and Yabin Zheng and Maosong Sun", 
		  title = "Automatic Keyphrase Extraction by Bridging Vocabulary Gap", 
		  booktitle = "Proceedings of The 15th Conference on Computational Natural Language Learning (CoNLL)",
		  year =  "2011",
	  }
---------------------------------------------------

2. DATA FORMAT

The data is recorded in JSON format (http://json.org/), which is easy for humans to read and write. Each article contains the following fields:
-	date: the publication date of the article;
-	summary: the short summary of the article;
-	source: the URL of the article;
-	id: the unique ID of the article in this dataset;
-	content: the main body of the article;
-	title: the title of the article;
-	tags: the keyphrases of the article.

Records may also contain fields that are not described here, such as the timestamp, resourceKey, userId and extras fields in the example below.

Here is an example of a record for a news article:

{"date":"2010-6-12 9:16:00","summary":"ʾʿٹʼŵĵ鱨ʾ2009ȫԪĿ14%й31%ȫλ","source":"http://news.163.com/10/0612/09/68VG9IS3000146BD.html","id":"7","content":"ʿٹʼµоʾ۰Ԫռȫۼͥıȫڶ\n612ձ ǵձʿٹʼµоʾ۰ԪռȫۼͥıȫڶԼͥ㣬ȫԪܶߵĵ֮һȥİ򸻻Ŀռȫۼͥ8.8%ھþ¼µ11.4%\nüűʾȥȫĿԼ14%¼¼ΪףȫƸظΣǰˮƽȫƸ11.5%ʲģ111.5Ԫ(Լ868.9ڸԪ)ӽ2007¼Ե111.6Ԫ\nȫİԪĿﵽ1120򻧣ӵ࣬472򻧡ʱȫ򸻻Լ2007ʱĿ2008򸻻һȼԼ14%980򻧡ȥ¼°򸻻Ŀ࣬35%Ϊǵ33%й31%λ\n","title":"2009йԪ31%","timestamp":0,"resourceKey":"","userId":"","tags":["ܼ",""],"extras":""}
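
A record in this format can be read with any JSON library. The following is a minimal Python sketch, not part of the dataset distribution. It assumes that the date field always follows the "YYYY-M-D H:MM:SS" pattern seen in the example above, and that a record is available as a single JSON string; how records are laid out in the downloaded files and which character encoding they use are not specified here, so file reading is left out. The name parse_record and the sample values are hypothetical.

```python
import json
from datetime import datetime

# The seven fields documented in section 2 (DATA FORMAT).
DOCUMENTED_FIELDS = ("date", "summary", "source", "id", "content", "title", "tags")

# Assumption: every date looks like the one in the example record,
# e.g. "2010-6-12 9:16:00" (month, day and hour are not zero-padded).
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_record(text):
    """Parse one JSON record and return its documented fields as a dict.

    parse_record is a hypothetical helper, not something shipped with the data.
    """
    record = json.loads(text)
    missing = [name for name in DOCUMENTED_FIELDS if name not in record]
    if missing:
        raise ValueError("record is missing fields: " + ", ".join(missing))
    # Keep only the documented fields; undocumented ones are dropped.
    article = {name: record[name] for name in DOCUMENTED_FIELDS}
    # Convert the date string into a datetime object.
    article["date"] = datetime.strptime(record["date"], DATE_FORMAT)
    return article


# Hypothetical record with placeholder English text, for illustration only.
SAMPLE = json.dumps({
    "date": "2010-6-12 9:16:00",
    "summary": "placeholder summary",
    "source": "http://news.163.com/",
    "id": "7",
    "content": "placeholder body\n",
    "title": "placeholder title",
    "tags": ["keyphrase one", "keyphrase two"],
    "extras": "",
})

article = parse_record(SAMPLE)
print(article["id"], article["date"].isoformat(), article["tags"])
```

Note that id is stored as a string in the example record, so the sketch keeps it as a string rather than converting it to an integer.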
---------------------------------------------------

For more information, please visit: http://nlp.csai.tsinghua.edu.cn/~lzy
