2021 MCM/ICM Problem D: explanation of ideas and shared pre-processing programs (with data pre-processing code)

2022-06-25 12:05:00 Halosec_ Wei

2021 ICM

Problem D: The Influence of Music

(With data pre-processing code.) MPai discussion group: 715829047 (the QQ group files contain the programs and videos)


Music has been part of human society since ancient times and is an essential component of cultural heritage. As part of an effort to understand the role music has played in the collective human experience, we have been asked to develop a method to quantify musical evolution. Many factors can influence artists when they create a new piece of music, including their innate creativity, current social or political events, access to new instruments or tools, and other personal experiences. Our goal is to understand and measure the influence of previously produced music on new music and musical artists.

Some artists can list a dozen or more other artists who they say influenced their own musical work. Influence can also be measured by the degree of similarity between song characteristics such as structure, rhythm, or lyrics. Music sometimes goes through revolutionary shifts that offer new sounds or tempos, for example when a new genre emerges or an existing genre (e.g. classical, pop/rock, jazz) is reinvented. This can be due to a sequence of small changes, a cooperative effort of artists, a series of influential artists, or a shift within society.

Many songs have similar sounds, and many artists have contributed to major shifts in a musical genre. Sometimes these shifts are due to one artist influencing another. Sometimes the change emerges in response to external events, such as major world events or technological advances. By considering networks of songs and their musical characteristics, we can begin to capture the influence that musical artists have on each other. Perhaps we can also gain a better understanding of how music evolves through societies over time.

Your team has been appointed by the Integrative Collective Music (ICM) Society to develop a model that measures musical influence.

This problem asks you to examine evolutionary and revolutionary trends of music artists and genres. To do this, the ICM has given your team several data sets:

1) "influence_data" represents musical influencers and followers, as reported by the artists themselves and according to the opinions of industry experts. These data contain influencers and followers for 5,854 artists over the last 90 years.

2) "full_music_data" provides 16 variables, including musical features such as danceability, tempo, loudness, and key, along with artist_name and artist_id for each of 98,340 songs. These data are used to create two summary data sets:

a. mean values by artist, "data_by_artist",

b. mean values by year, "data_by_year".
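The relationship between the full data set and the two summary data sets can be sketched as a group-by followed by a mean. The snippet below is only an illustration, assuming the summary files hold per-artist and per-year means of the track features; the three tracks, the two-feature subset `FEATURES`, and the helper name `summarise` are hypothetical, and each row is assumed to carry a single `artist_id`.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tracks; only two features are shown to keep the sketch short.
tracks = [
    {"artist_id": "1", "year": 1960, "danceability": 0.4, "energy": 0.3},
    {"artist_id": "1", "year": 1961, "danceability": 0.6, "energy": 0.5},
    {"artist_id": "2", "year": 1960, "danceability": 0.8, "energy": 0.9},
]
FEATURES = ("danceability", "energy")


def summarise(rows, key):
    """Group rows by `key`; return per-group feature means plus a song count."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    summary = {}
    for group, members in groups.items():
        summary[group] = {f: mean(m[f] for m in members) for f in FEATURES}
        summary[group]["count"] = len(members)
    return summary


by_artist = summarise(tracks, "artist_id")  # analogue of data_by_artist
by_year = summarise(tracks, "year")         # analogue of data_by_year
```

Recomputing the summaries this way is a cheap consistency check on the provided files before any modelling starts.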

Note: The data provided in these files are a subset of larger data sets. These files contain the only data you should use for this problem.

To carry out this challenging project and explore the evolution of music through the influence of music artists on one another over time, the ICM Society asks your team to do the following:

● Use the influence_data data set, or portions of it, to create one or more directed networks of musical influence in which influencers are connected to followers. Develop parameters that capture "music influence" in this network. Explore a subset of musical influence by creating a subnetwork of your directed influencer network, and describe this subnetwork. What do your "music influence" measures reveal in this subnetwork?

● Use the music characteristics in full_music_data and/or the two summary data sets (by artist and by year) to develop measures of music similarity. Using your measures, are artists within a genre more similar than artists from different genres? Compare similarities and influences between and within genres. What distinguishes a genre, and how do genres change over time? Are some genres related to others? Indicate whether the similarity data, as reported in the data_influence data set, suggest that the identified influencers in fact influence the respective artists. Do the "influencers" actually affect the music created by the followers? Are some music characteristics more "contagious" than others, or do they all play similar roles in influencing a particular artist's music?

● Identify from these data whether there are characteristics that might signify revolutions (major leaps) in musical evolution. Which artists represent revolutionaries (influencers of major change) in your network?

● Analyze the influence processes of musical evolution that occurred over time in one genre. Can your team identify indicators that reveal the dynamic influencers, and explain how the genre or artists changed over time?

● How does your work express information about the cultural influence of music in time or circumstances? Alternatively, how can the effects of social, political, or technological changes (such as the internet) be identified within your network?
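For the first task above, a directed influence network and two simple "music influence" parameters can be sketched with the standard library alone. This is a minimal illustration, not the official or the author's method: the edge list is made up, and the choice of direct follower count and downstream reach as influence measures, as well as the function names, are assumptions.

```python
from collections import defaultdict, deque

# Hypothetical (influencer_id, follower_id) pairs; real edges would come
# from influence_data.csv.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def build_network(edge_list):
    """Return a mapping influencer -> set of direct followers."""
    followers = defaultdict(set)
    for influencer, follower in edge_list:
        followers[influencer].add(follower)
    return followers


def direct_influence(followers):
    """Out-degree: number of direct followers per influencer."""
    return {artist: len(f) for artist, f in followers.items()}


def reach(followers, start):
    """Number of artists reachable from `start` along influence edges (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in followers.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) - 1  # exclude the starting artist


def subnetwork(edge_list, keep):
    """Edges whose two endpoints are both in `keep` (e.g. one genre)."""
    return [(u, v) for u, v in edge_list if u in keep and v in keep]


network = build_network(edges)
```

Here `reach(network, "A")` counts every artist downstream of A, so it rewards indirect influence that a plain follower count misses. Passing the set of artist ids for one genre or one decade to `subnetwork` gives the restricted network the task asks you to describe.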

Write a one-page document to the ICM Society explaining the value of using your approach to understand the influence of music through networks. Considering that the two problem data sets were limited to certain genres, and then to the artists common to both data sets, how would your work or solutions change with more or richer data? Recommend further study of music and its effect on culture.

The ICM Society, an interdisciplinary and diverse group from the fields of music, history, social science, technology, and mathematics, looks forward to your final report. Your PDF solution (no more than 25 pages) should include:

* A one page summary

* Table of contents

* Your solution

* A one-page document to the ICM Society

* Reference list

Note: New for 2021, the ICM contest has a 25-page limit. All aspects of your submission count toward the 25-page limit: summary sheet, table of contents, main body of the solution, images and tables, one-page document, reference list, and any appendices.

Attachments

We provide the following four data files for this problem. The data files provided contain the only data you should use for this problem.

1. influence_data.csv

2. full_music_data.csv

3. data_by_artist.csv

4. data_by_year.csv

Data description

1. influence_data.csv

The data are encoded in UTF-8 so that special characters can be handled.

influencer_id: A unique identification number for the influencing artist. (string of digits)

influencer_name: The name of the influencing artist, as given by the follower or by industry experts. (string)

influencer_main_genre: The genre that best describes the majority of the music produced by the influencing artist. (if available) (string)

influencer_active_start: The decade in which the influencing artist began their music career. (integer)

follower_id: A unique identification number for the following artist. (string of digits)

follower_name: The name of the artist following an influencing artist. (string)

follower_main_genre: The genre that best describes the majority of the music produced by the following artist. (if available) (string)

follower_active_start: The decade in which the following artist began their music career. (integer)
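A first pre-processing step is to read these columns with the right types. The sketch below is only an illustration of that step: the two rows in `SAMPLE` are made up, the helper name `read_influence` is hypothetical, and treating an empty genre as "not available" is an assumption about how missing values appear in the file.

```python
import csv
import io

# Two made-up rows in the column layout described above; the values are
# illustrative only and are not taken from the real file.
SAMPLE = """influencer_id,influencer_name,influencer_main_genre,influencer_active_start,follower_id,follower_name,follower_main_genre,follower_active_start
001,Artist A,Jazz,1950,002,Artist B,Pop/Rock,1970
001,Artist A,Jazz,1950,003,Artist C,,1980
"""


def read_influence(handle):
    """Parse influence rows, keeping ids as strings and start decades as ints."""
    rows = []
    for row in csv.DictReader(handle):
        row["influencer_active_start"] = int(row["influencer_active_start"])
        row["follower_active_start"] = int(row["follower_active_start"])
        # Assumption: an empty genre means "not available"; store it as None.
        for col in ("influencer_main_genre", "follower_main_genre"):
            row[col] = row[col] or None
        rows.append(row)
    return rows


rows = read_influence(io.StringIO(SAMPLE))
# For the real file, pass open("influence_data.csv", encoding="utf-8", newline="").
```

Keeping the ids as strings follows the "string of digits" type in the description and avoids losing any leading zeros.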

2. full_music_data.csv 3. data_by_artist.csv 4. data_by_year.csv

Fields from Spotify (an audio player):

artist_name: The artist who performed the track. (array)

artist_id: The same unique identification number given in the influence_data.csv file. (string of digits)

Musical features:

danceability: A measure of how suitable a track is for dancing, based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable. (float)

energy: A measure of the perception of intensity and activity. A value of 0.0 is least intense/energetic and 1.0 is most intense/energetic. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy. (float)

valence: A measure of the musical positiveness conveyed by a track. A value of 0.0 is most negative and 1.0 is most positive. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry). (float)

tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece, and it derives directly from the average beat duration. (float)

loudness: The overall loudness of a track in decibels (dB). Values typically range between -60 and 0 dB. Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). (float)

mode: An indication of the modality (major or minor) of a track, that is, the type of scale from which its melodic content is derived. Major is represented by 1 and minor by 0.

key: The estimated overall key of the track. Integers map to pitches using standard pitch class notation, e.g. 0 = C, 1 = C#/Db, 2 = D, and so on. If no key was detected, the value is -1. (integer)
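Because these features live on different scales (0 to 1, BPM, dB), they need rescaling before they can be combined into a similarity measure. The sketch below shows one possible approach, min-max scaling followed by a normalised Euclidean distance; it is not the author's method. The loudness range follows the typical -60 to 0 dB stated above, while the 250 BPM tempo ceiling, the feature subset, the function names, and the two example tracks are assumptions.

```python
import math

# Assumed ranges for min-max scaling. The loudness range follows the typical
# -60..0 dB stated above; the tempo ceiling of 250 BPM is an assumption.
RANGES = {
    "danceability": (0.0, 1.0),
    "energy": (0.0, 1.0),
    "valence": (0.0, 1.0),
    "tempo": (0.0, 250.0),
    "loudness": (-60.0, 0.0),
}


def scale(track):
    """Map each feature to [0, 1], clipping values outside the assumed range."""
    scaled = []
    for name, (low, high) in RANGES.items():
        value = (track[name] - low) / (high - low)
        scaled.append(min(1.0, max(0.0, value)))
    return scaled


def similarity(track_a, track_b):
    """1 minus normalised Euclidean distance; 1.0 means identical features."""
    a, b = scale(track_a), scale(track_b)
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 - distance / math.sqrt(len(a))


# Hypothetical feature values for illustration only.
quiet = {"danceability": 0.3, "energy": 0.2, "valence": 0.4,
         "tempo": 80.0, "loudness": -20.0}
loud = {"danceability": 0.7, "energy": 0.9, "valence": 0.6,
        "tempo": 140.0, "loudness": -5.0}
```

The sketch leaves out mode and key on purpose: both are categorical labels rather than magnitudes, so a numeric distance between key 0 and key 11 would not be meaningful without a separate encoding.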

Type of music:

acousticness: A confidence measure of whether the track is acoustic (without technology enhancements or electrical amplification). A value of 1.0 represents high confidence that the track is acoustic. (float)

instrumentalness: Predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly "vocal". The closer the instrumentalness value is to 1.0, the greater the likelihood that the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0. (float)

liveness: Detects the presence of an audience in a track. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 gives a strong likelihood that the track is live. (float)

speechiness: Detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer the attribute value is to 1.0. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks. (float)

explicit: Detects explicit lyrics in a track (true (1) = yes it does; false (0) = no it does not OR unknown). (Boolean)
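The cut-offs quoted above (0.33 and 0.66 for speechiness, 0.5 for instrumentalness, 0.8 for liveness) can be turned into categorical flags during pre-processing. The snippet below is a small sketch of that idea; the function names, the category labels, the handling of values exactly on a cut-off, and the example track are assumptions rather than part of the data description.

```python
def speech_category(speechiness):
    """Bucket a speechiness value using the 0.33 and 0.66 cut-offs."""
    if speechiness > 0.66:
        return "spoken word"
    if speechiness >= 0.33:  # assumption: boundary values count as mixed
        return "music and speech"
    return "music"


def track_flags(track):
    """Apply the instrumentalness (> 0.5) and liveness (> 0.8) cut-offs."""
    return {
        "instrumental": track["instrumentalness"] > 0.5,
        "live": track["liveness"] > 0.8,
        "speech": speech_category(track["speechiness"]),
    }


# Hypothetical track values for illustration only.
example = {"instrumentalness": 0.92, "liveness": 0.12, "speechiness": 0.04}
flags = track_flags(example)
```

Flags like these are mainly useful for filtering, for example setting aside spoken-word recordings before comparing the musical features of artists.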

Description:

duration_ms: The duration of the track in milliseconds. (integer)

popularity: The popularity of the track. The value is between 0 and 100, with 100 being the most popular. Popularity is calculated by an algorithm and is based, for the most part, on the total number of plays the track has had and how recent those plays are. Generally speaking, songs that are being played a lot now will have a higher popularity than songs that were played a lot in the past. Duplicate tracks (e.g. the same track from a single and from an album) are rated independently. Artist and album popularity is derived mathematically from track popularity. (integer)

year: The year of release of the track. (integer from 1921 to 2020)

release_date: The calendar date of release of the track, mostly in yyyy-mm-dd format; however, the precision of the date may vary, and some are given only as yyyy.

song_title (censored): The name of the track. (string) Software was run to remove any potentially explicit words from the song title.

count: The number of songs by a particular artist in the full_music_data.csv file. (integer)
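Since release_date mixes full dates with year-only values, it is worth parsing it with the precision recorded explicitly. The sketch below handles only the two formats named in the description; the function name `parse_release_date`, the choice to pin year-only values to 1 January, and the decision to raise on anything else are assumptions.

```python
from datetime import date


def parse_release_date(text):
    """Parse 'yyyy-mm-dd' or 'yyyy'; return a (date, precision) pair."""
    text = text.strip()
    parts = text.split("-")
    if len(parts) == 3:
        year, month, day = (int(p) for p in parts)
        return date(year, month, day), "day"
    if len(parts) == 1 and len(text) == 4 and text.isdigit():
        # Year-only value: pin to 1 January and record the lower precision.
        return date(int(text), 1, 1), "year"
    raise ValueError(f"unrecognised release_date: {text!r}")
```

Returning the precision alongside the date lets later steps, such as monthly trend plots, skip the year-only rows instead of treating them as January releases.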

Copyright notice
This article was written by [Halosec_ Wei]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202200535108741.html