2021 MCM/ICM Problem D: explanation of the approach and shared pre-processing programs (with data pre-processing code)
2022-06-25 12:05:00 【Halosec_ Wei】
2021 ICM
Problem D: The Influence of Music
(With data pre-processing code) MPai discussion group: 715829047 (the programs and videos are in the QQ group files)
[A data analysis tool for beginners]
MPai WeChat official account: All souls data
MPai data science platform official website: www.mpaidata.com
Music has been part of human society since ancient times and is an essential component of cultural heritage. To understand the role music has played in the collective human experience, we have been asked to develop a method to quantify musical evolution. Many factors can influence artists when they create new music, including their innate creativity, current social or political events, access to new instruments or tools, and other personal experiences. Our goal is to understand and measure the influence of previously produced music on new music and on musical artists.
Some artists can list a dozen or more other artists who they say influenced their own work. Influence can also be measured by how similar song characteristics are, for example structure, rhythm, or lyrics. Music sometimes undergoes revolutionary shifts that offer new sounds or tempos, such as when a new genre emerges or an existing genre (for example classical, pop/rock, or jazz) is reinvented. Such a shift may result from a series of small changes, the collaborative efforts of artists, a series of influential artists, or changes within society.
Many songs sound similar, and many artists have contributed to major shifts in a musical genre. Sometimes these shifts are due to one artist influencing another. Sometimes the change arises in response to external events, such as major world events or technological advances. By considering networks of songs and their musical characteristics, we can begin to capture the influence that musical artists have on each other, and perhaps gain a better understanding of how music evolves through societies over time.
Your team has been appointed by the Integrative Collective Music (ICM) Society to develop a model that measures musical influence.
This problem asks you to examine the evolutionary and revolutionary trends of musical artists and genres. To do this, the ICM Society has given your team several data sets:
1) "influence_data": represents musical influencers and followers, as reported by the artists themselves and by industry experts. These data cover the influencers and followers of 5,854 artists over the past 90 years.
2) "full_music_data": provides 16 variables, including musical features such as danceability, tempo, loudness, and key, along with artist_name and artist_id, for each of 98,340 songs. These data are used to create two summary data sets:
a. summary values per artist, "data_by_artist",
b. summary values per year, "data_by_year".
Note: The data provided in these files are a subset of larger data sets. These files contain the only data you should use for this problem.
To carry out this challenging project and explore the evolution of music through the influence of musical artists over time, the ICM Society asks your team to address the following:
● Use the influence_data data set, or portions of it, to build one or more directed networks of musical influence in which influencers are connected to followers. Develop parameters that capture "music influence" in this network. Explore a subset of musical influence by creating a subnetwork of your directed influencer network, and describe this subnetwork. What do your "music influence" measures reveal in this subnetwork?
● Use the musical features in full_music_data and/or the two summary data sets (by artist and by year) to develop a measure of music similarity. Using your measure, are artists within a genre more similar than artists from different genres? Compare similarities and influences between and within genres. What distinguishes a genre, and how do genres change over time? Are some genres related to others? Indicate whether the similarity data, as reported in the data_influence data set, suggest that the identified influencers did in fact influence the respective artists. Do the "influencers" actually affect the music created by their followers? Are some musical characteristics more "contagious" than others, or do they all play similar roles in influencing a particular artist's music?
● Determine from these data whether there are characteristics that might signify revolutions (major leaps) in musical evolution. Which artists in your network represent revolutionaries (influencers of major change)?
● Analyze the influence processes of musical evolution over time within one genre. Can your team identify indicators that reveal the dynamic influencers, and explain how the genre or artist changed over time?
● How does your work express information about the cultural influence of music in time or circumstances? Alternatively, how can the effects of social, political, or technological changes (such as the internet) be identified within your network?
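As an illustration of the first task, the sketch below builds a directed influencer-to-follower network and derives two simple quantities from it: the out-degree of each influencer (a crude stand-in for "music influence") and a subnetwork reachable from one artist within a fixed number of hops. It is only one possible starting point, not the required model. The sample edges and the function names are hypothetical; real edges would come from the influencer_id and follower_id columns of influence_data.csv.

```python
from collections import defaultdict

# Hypothetical sample edges (influencer_id, follower_id); real edges would be
# read from the influencer_id and follower_id columns of influence_data.csv.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def build_network(edge_list):
    """Return adjacency sets mapping each influencer to its followers."""
    followers = defaultdict(set)
    for influencer, follower in edge_list:
        followers[influencer].add(follower)
    return followers


def out_degree(followers):
    """Count direct followers per influencer (a crude influence score)."""
    return {artist: len(direct) for artist, direct in followers.items()}


def subnetwork(followers, root, depth):
    """Collect the edges reachable from `root` within `depth` hops (BFS)."""
    frontier, seen, sub_edges = [root], {root}, []
    for _ in range(depth):
        next_frontier = []
        for artist in frontier:
            # sorted() only makes the edge order deterministic.
            for follower in sorted(followers.get(artist, ())):
                sub_edges.append((artist, follower))
                if follower not in seen:
                    seen.add(follower)
                    next_frontier.append(follower)
        frontier = next_frontier
    return sub_edges


network = build_network(edges)
scores = out_degree(network)
two_hop = subnetwork(network, "A", 2)
```

Out-degree ignores indirect influence, so a fuller model would probably weight followers by their own influence or by the gap between the two active_start values.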
Write a one-page document to the ICM Society explaining the value of using your approach to understand the influence of music through networks. Considering that the two problem data sets are limited to certain genres, and then to the artists that both data sets share, how would your work or solutions change with more or richer data? Recommend further study of music and its effect on culture.
The ICM Society, an interdisciplinary and diverse group drawn from music, history, social science, technology, and mathematics, looks forward to your final report. Your PDF solution (no more than 25 pages) should include:
* A one-page summary sheet
* Table of contents
* Your complete solution
* A one-page document to the ICM Society
* Reference list
Note: New for 2021, the ICM contest has a 25-page limit. Every part of your submission counts toward the 25-page limit: the summary sheet, table of contents, main body of the solution, images and tables, the one-page document, the reference list, and any appendices.
Attachments
We provide the following four data files for this problem. The data files provided contain the only data you should use for this problem.
1. influence_data.csv
2. full_music_data.csv
3. data_by_artist.csv
4. data_by_year.csv
Data description
1. influence_data.csv
The data are encoded in UTF-8 so that special characters can be handled.
influencer_id: Unique identification number of the influencing artist. (string of digits)
influencer_name: Name of the influencing artist, as given by the follower or by industry experts. (string)
influencer_main_genre: The genre that best describes most of the music produced by the influencing artist, if available. (string)
influencer_active_start: The decade in which the influencing artist began their music career. (integer)
follower_id: Unique identification number of the following artist. (string of digits)
follower_name: Name of the artist who follows the influencing artist. (string)
follower_main_genre: The genre that best describes most of the music produced by the following artist, if available. (string)
follower_active_start: The decade in which the following artist began their music career. (integer)
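A minimal pre-processing sketch for this file, using only the standard library: it reads the rows with csv.DictReader, converts the two active_start columns to integers, turns an empty genre into None, and keeps the ids as strings as the field descriptions suggest. The two sample rows and the name read_influence are hypothetical; the column names follow the list above, and the real file would be opened with open("influence_data.csv", encoding="utf-8", newline="").

```python
import csv
import io

# Hypothetical two-row sample in the documented column layout; the real file
# would be opened with open("influence_data.csv", encoding="utf-8", newline="").
SAMPLE = io.StringIO(
    "influencer_id,influencer_name,influencer_main_genre,influencer_active_start,"
    "follower_id,follower_name,follower_main_genre,follower_active_start\n"
    "1,Artist One,Jazz,1950,2,Artist Two,Pop/Rock,1970\n"
    "1,Artist One,Jazz,1950,3,Artist Three,,1980\n"
)


def read_influence(handle):
    """Read influence rows, typing the start decades and empty genres."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for side in ("influencer", "follower"):
            # Start decades are documented as integers.
            row[f"{side}_active_start"] = int(row[f"{side}_active_start"])
            # The genre is optional ("if available"), so map "" to None.
            row[f"{side}_main_genre"] = row[f"{side}_main_genre"] or None
        rows.append(row)
    return rows


influence_rows = read_influence(SAMPLE)
```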
2. full_music_data.csv, 3. data_by_artist.csv, 4. data_by_year.csv
Spotify (audio player) fields:
artist_name: The artist who performed the track. (array)
artist_id: The same unique identification number as given in the influence_data.csv file. (string of digits)
Musical features:
danceability: A measure of how suitable a track is for dancing, based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable. (float)
energy: A measure of perceived intensity and activity. A value of 0.0 is least intense/energetic and 1.0 is most intense/energetic. Energetic tracks typically feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy. (float)
valence: A measure of the musical positiveness conveyed by a track. A value of 0.0 is most negative and 1.0 is most positive. Tracks with high valence sound more positive (for example happy, cheerful, euphoric), while tracks with low valence sound more negative (for example sad, depressed, angry). (float)
tempo: The overall estimated tempo of a track, in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration. (float)
loudness: The overall loudness of a track, in decibels (dB). Values typically range from -60 to 0 dB. Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). (float)
mode: An indication of the modality (major or minor) of a track, that is, the type of scale from which its melodic content is derived. Major is represented by 1 and minor by 0. (integer)
key: The estimated overall key of the track. Integers map to pitches using standard pitch class notation, for example 0 = C, 1 = C#/Db, 2 = D, and so on. If no key was detected, the value is -1. (integer)
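The key and mode encodings above can be decoded into readable labels during pre-processing. The sketch below assumes the standard pitch class numbering from 0 = C through 11 = B and the 1 = major, 0 = minor convention stated above; the name describe_key is hypothetical.

```python
# Standard pitch class notation: index 0 is C, index 11 is B.
PITCH_CLASSES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
                 "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]


def describe_key(key, mode):
    """Return a label such as 'D major', or None when no key was detected."""
    if key == -1:
        return None
    if not 0 <= key <= 11:
        raise ValueError(f"key must be -1 or in 0..11, got {key}")
    quality = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {quality}"
```

Because key is a categorical code, treating it as an ordinary number in a distance measure would be misleading; decoding it, or one-hot encoding it, is usually the safer choice.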
Type of music:
acousticness: A confidence measure of whether the track is acoustic (without technological enhancement or electrical amplification). A value of 1.0 represents high confidence that the track is acoustic. (float)
instrumentalness: Predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken-word tracks are clearly "vocal". The closer the instrumentalness value is to 1.0, the greater the likelihood that the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0. (float)
liveness: Detects the presence of an audience in the track. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 indicates a strong likelihood that the track is live. (float)
speechiness: Detects the presence of spoken words in a track. The more exclusively speech-like the recording (for example talk show, audiobook, poetry), the closer the value is to 1.0. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including cases such as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks. (float)
explicit: Detects explicit lyrics in a track (true (1) = yes, it does; false (0) = no, it does not, OR unknown). (boolean)
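Because the features sit on different scales (most lie in 0.0 to 1.0, loudness is typically between -60 and 0 dB, and tempo is in BPM), they usually need rescaling before any similarity measure is computed. The sketch below shows one possible approach, not the required one: min-max scaling of each feature across the records, followed by cosine similarity between two feature vectors. The three sample records, the chosen feature list, and the function names are hypothetical.

```python
import math

# Feature subset chosen for illustration; any numeric columns could be used.
FEATURES = ["danceability", "energy", "valence", "tempo", "loudness"]

# Hypothetical per-artist records in the style of data_by_artist.csv.
artists = [
    {"name": "X", "danceability": 0.8, "energy": 0.9, "valence": 0.7,
     "tempo": 120.0, "loudness": -5.0},
    {"name": "Y", "danceability": 0.7, "energy": 0.8, "valence": 0.6,
     "tempo": 118.0, "loudness": -6.0},
    {"name": "Z", "danceability": 0.2, "energy": 0.1, "valence": 0.3,
     "tempo": 70.0, "loudness": -25.0},
]


def min_max_scale(records, features=FEATURES):
    """Return copies of the records with each feature rescaled to [0, 1]."""
    scaled = [dict(r) for r in records]
    for name in features:
        values = [r[name] for r in records]
        low, span = min(values), max(values) - min(values)
        for r in scaled:
            # A constant column carries no information, so map it to 0.0.
            r[name] = (r[name] - low) / span if span else 0.0
    return scaled


def cosine_similarity(a, b, features=FEATURES):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(a[f] * b[f] for f in features)
    norm_a = math.sqrt(sum(a[f] ** 2 for f in features))
    norm_b = math.sqrt(sum(b[f] ** 2 for f in features))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


scaled = min_max_scale(artists)
sim_xy = cosine_similarity(scaled[0], scaled[1])
sim_xz = cosine_similarity(scaled[0], scaled[2])
```

In this toy sample X and Y come out far more similar than X and Z. Note that min-max scaling sends the smallest record to the zero vector, where cosine similarity is undefined (reported here as 0.0); standardising to z-scores and using Euclidean distance is a common alternative.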
Description:
duration_ms: The duration of the track in milliseconds. (integer)
popularity: The popularity of the track. The value is between 0 and 100, with 100 being the most popular. Popularity is calculated by an algorithm and is based, for the most part, on the total number of plays the track has had and how recent those plays are. Generally speaking, songs that are being played a lot now will have a higher popularity than songs that were played a lot in the past. Duplicate tracks (for example, the same track from a single and from an album) are rated independently. Artist and album popularity are derived mathematically from track popularity. (integer)
year: The year of release. (integer from 1921 to 2020)
release_date: The calendar date of the track's release, mostly in yyyy-mm-dd format. The precision of the date may vary, however, and some entries give only yyyy.
song_title (censored): The name of the track. (string) Software has been run to remove any potentially explicit words from the song title.
count: The number of songs by a particular artist that appear in the full_music_data.csv file. (integer)
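Two pre-processing steps follow from these descriptions: the array-valued artist fields have to be split before songs can be attributed to individual artists, and release_date needs tolerant parsing because some entries carry only a year. The sketch below assumes, and this is not confirmed by the description, that the array is stored as a Python-style list literal such as "[101, 202]". The sample rows and function names are hypothetical.

```python
import ast
from collections import Counter


def parse_id_list(text):
    """Parse an array-valued field such as "[12, 34]" into a list of strings.

    Assumes the array is written as a Python-style list literal.
    """
    value = ast.literal_eval(text)
    if not isinstance(value, (list, tuple)):
        value = [value]
    return [str(v) for v in value]


def release_year(release_date):
    """Return the year from a 'yyyy-mm-dd' or bare 'yyyy' date string."""
    return int(release_date.strip()[:4])


def songs_per_artist(rows):
    """Count how many songs each artist id appears on."""
    counts = Counter()
    for row in rows:
        counts.update(parse_id_list(row["artist_id"]))
    return counts


# Hypothetical rows in the style of full_music_data.csv.
sample_rows = [
    {"artist_id": "[101]", "release_date": "1965-03-22"},
    {"artist_id": "[101, 202]", "release_date": "1971"},
    {"artist_id": "[202]", "release_date": "1980-07-04"},
]
song_counts = songs_per_artist(sample_rows)
years = [release_year(r["release_date"]) for r in sample_rows]
```

Counting this way credits a collaborative track to every listed artist, which may or may not match how the count column in data_by_artist.csv was produced.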