While looking around for a data set to use, I stumbled upon this Kaggle entry on metal bands. Being a metal head since middle school, it piqued my interest like no other data set I've seen so far. After downloading the two data sets from here:
Data | Link | Stats | Date Accessed |
---|---|---|---|
Encyclopedia Metallum | Metal-Archives | 119474 Bands | 12/5/17 |
Wikipedia | Population Data | 240 Countries | 12/15/17 |
When I used to look up bands, I used the Encyclopedia Metallum, and the database had just about every obscure band
you could find. Since they let users contribute to the database (the acceptance process is heavily monitored),
small bands can put themselves into the database. Instead of using the Kaggle data, I decided to scrape the Metal
Archives myself since I felt that 5,000 bands was lacking for an analysis of the entire world. For population data,
I opted for Wikipedia since most of their population information was based on a real-time
world population clock.
import time
from multiprocessing import Pool
import requests
from bs4 import BeautifulSoup
import html5lib
import json
import pandas as pd
import numpy as np
import csv
import re
To scrape the countries page of Encyclopedia Metallum, we need to make an HTTP request to the URL. If the request succeeds and a connection is created, the response should be:
<Response [200]>
encyclopaedia_metallum_country_url = 'https://www.metal-archives.com/browse/country'
countries_request = requests.get(encyclopaedia_metallum_country_url)
print(countries_request)
To convert the request response into usable HTML tags easily, BeautifulSoup
is the way to go. Make sure you have these imported:
from bs4 import BeautifulSoup
import html5lib
The html5lib library helps parse the response as HTML, thus allowing the use of .find_all(tag). Your goal is to look at the HTML, figure out which tags have the information you want, and specifically target those tags to filter out as much as possible.
Two common ways to narrow down tags:

1) Filter by the presence of an href or class attribute. For tags like:

<a href="https://www.metal-archives.com/lists/AF">Afghanistan</a>
<div class="countryCol">

use:

.find_all('a', href = True)
.find_all('div', class_ = True)

2) Specify attribute values with dictionaries. For a tag like:

<div class="clear loading">

use:

.find_all('div', {'class': 'clear loading'})
soup = BeautifulSoup(countries_request.content, 'html5lib')
countries_soup = soup.find_all('div',{'class': 'countryCol'})
countries_soup
As you can see from above, all the countries are in this tag format:
<a href="https://www.metal-archives.com/lists/(COUNTRY ID)">(COUNTRY)</a>
I simply extracted the countries by looping through every country tag and using a regular expression's (regex) .groups()
capability. Don't forget to import the regex library before use. Regexes are pretty useful; take a look here for the official documentation.
import re
re.compile(r'<a href="https://www.metal-archives.com/lists/(.+)">(.+)</a>')
With parentheses around the parts I want to extract, .groups() will return a tuple in the form of (COUNTRY ID, COUNTRY).
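To see what .groups() returns, here's a tiny standalone check using the Afghanistan tag shown earlier:

```python
import re

# One of the tags scraped from the countries page
tag = '<a href="https://www.metal-archives.com/lists/AF">Afghanistan</a>'
country_regex = re.compile(r'<a href="https://www.metal-archives.com/lists/(.+)">(.+)</a>')
print(country_regex.match(tag).groups())  # → ('AF', 'Afghanistan')
```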
# Create country dictionary where the country ID is the key and the country name is the value
all_countries = {}
country_regex = re.compile(r'<a href="https://www.metal-archives.com/lists/(.+)">(.+)</a>')
for col in countries_soup:
    for a_href in col.find_all('a'):  # Further filtering tags down to only country tags
        tag = str(a_href).strip()
        country_id, country = country_regex.match(tag).groups()  # (ID, country)
        all_countries[country_id] = country
all_countries
This part is somewhat complex because of the way the Encyclopedia Metallum is built. When you click on a country, you will see all the bands for that specific country. But if you BeautifulSoup the HTML code, you won't see any of the listed bands, because the webpage is rendered with JavaScript, as you can see from the inspect-element view I captured through Internet Explorer (IE).

1) Source for deriving the solutions to #1 and #2: jonchar on GitHub wrote his own scraper for the Encyclopedia Metallum.
2) The solution to #3 was from this Stack Overflow post, and links to HTTP headers are here.

In the inspect-element view you can see the JavaScript command to the AJAX database (this is where all their data on bands is stored). To specify which country you want to see, you simply put in the country's ID, which we extracted earlier.
var grid = createGrid("#bandListCountry", 500, 'browse/ajax-country/c/(COUNTRY ID)/json/1/', {aoColumns: [ null, null, null, { bSortable: false, sWidth: '80px'} ] });
Instead of doing an HTTP request like usual, we have to add the AJAX call to the end of the URL. The AJAX database doesn't return HTML; it gives you data in JSON form, essentially Python dictionary form. Make sure you have the JSON library imported.
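As a quick sketch of that conversion, here's json.loads on a trimmed-down, made-up response with the same shape (the iTotalRecords and aaData field names appear in the real responses used below; the row contents here are invented):

```python
import json

# A made-up response in the same shape the AJAX endpoint returns
response_text = '{"iTotalRecords": 702, "aaData": [["Band tag", "Black Metal", "Oslo", "Status tag"]]}'
page_content = json.loads(response_text)
print(page_content['iTotalRecords'])   # 702
print(page_content['aaData'][0][1])    # Black Metal
```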
import json
When a country has more than 500 bands, there will be more than one page, so we use the params
part of HTTP requests and give it a payload. Look at jonchar's README.md for a detailed explanation of the payload. Basically, to get the first page, you want:
payload = {'sEcho': 0, 'iDisplayStart': 0, 'iDisplayLength': 500}
For every subsequent page, you increase the start offset by 500. You want to do this until you've reached the last page.
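The paging arithmetic can be sketched as a small helper; the page_payloads function and the 1234-record total here are made up for illustration:

```python
# Sketch: one payload per page of 500 records, for a hypothetical
# country with 1234 bands total
def page_payloads(total_records, page_size=500):
    return [{'sEcho': 0, 'iDisplayStart': start, 'iDisplayLength': page_size}
            for start in range(0, total_records, page_size)]

for p in page_payloads(1234):
    print(p['iDisplayStart'])  # 0, 500, 1000
```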
Since we are querying one country at a time (there are around 140 countries listed) and one page at a time, we'll make close to 200 requests to Encyclopedia Metallum. To prevent the website from rejecting our HTTP requests, we have to pretend to be a browser accessing the page. The best way is to send a User-Agent along with the request. With the header below, I'm pretending to be Firefox:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'}
# Helper function that returns all the bands listed for
# a specific country and the specific page requested
def get_page_query(country, CID, start, end):
    # Full AJAX database call components
    URL = 'http://www.metal-archives.com'
    AJAX_REQUEST_BEGIN = '/browse/ajax-country/c/'
    AJAX_REQUEST_END = '/json/1/'
    payload = {'sEcho': 0,
               'iDisplayStart': start,
               'iDisplayLength': end}
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'}
    # Request with payload and headers
    r = requests.get(URL + AJAX_REQUEST_BEGIN + CID + AJAX_REQUEST_END,
                     params = payload,
                     headers = headers)
    # Converting response information into a Python dictionary
    page_content = json.loads(r.text)
    total = page_content['iTotalRecords']
    r.close()
    return {'finished_query': total <= end, 'data': page_content['aaData'], 'total': total}
This helper function uses the helper function above. It extracts the information we want and writes it into the TSV file passed in. This part mainly does the parsing, whereas the one above deals with the HTTP request. Essentially, a single call will write all the bands for the requested country into the designated TSV file.
# Helper function that uses the other helper function to get
# all the bands listed for a specific country
def get_bands_by_country(opened_file, writer, country, CID):
    start = 0
    end = 0
    query_complete = False
    # Regexes
    name_ahref_tag = re.compile("<a href='(.*)'>(.*)</a>")  # Name
    status_spanclass_tag = re.compile('<span class="(.*)">(.*)</span>')  # Status
    website_regex = re.compile('https://www.metal-archives.com/bands/(.*)/(.*)')  # Website and ID
    # Until the last page of the country is reached, keep querying
    while not query_complete:
        start = end  # iDisplayStart is 0-based, so the next page starts where the last one ended
        end = end + 500
        query = get_page_query(country, CID, start, end)
        query_complete = query['finished_query']
        query_data = query['data']  # Contains genres and locations
        for band in query_data:
            name_match = name_ahref_tag.match(band[0]).groups()
            status_match = status_spanclass_tag.match(band[3])
            name = name_match[1]
            website = name_match[0]
            ID = website_regex.match(website).groups()[1]
            genres = band[1]
            location = band[2]
            status = status_match.groups()[1]
            band_info = [name, ID,
                         country, CID, location,
                         genres, status, website]
            writer.writerow(band_info)  # Write into the passed-in TSV file
    print(country + ' (' + CID + '): SUCCESS [' + str(query['total']) + ']')
This part of the code executes all the helper functions above to write the information into a tab-separated values (TSV) file. Because the genre strings contain commas, I decided to use TSV to prevent problems with reading the file later on. To do this, you need to import the csv library:
import csv
To make this a TSV instead of a CSV, just add the
delimiter = '\t'
argument to your CSV writer. You can actually make a file with any separator you want with this feature.
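A minimal sketch of the delimiter option, writing one made-up row to an in-memory buffer instead of a file:

```python
import csv
import io

# Write to an in-memory buffer so nothing touches disk
buf = io.StringIO()
writer = csv.writer(buf, delimiter='\t')
writer.writerow(['band_name', 'genres'])
writer.writerow(['Example Band', 'Progressive Death Metal, Progressive Rock'])
print(buf.getvalue())
# The commas inside the genre string never collide with the tab delimiter
```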
If your computer is fast enough and you're patient enough, you can watch the progress via the print statements, which show the country and the number of bands queried.
# Creates a new TSV file and uses the two helper functions
# above to write the data to the file
with open('encyclopedia_metallum_data.tsv', 'w') as tsvfile:
    writer = csv.writer(tsvfile, delimiter = '\t')
    headings = ['band_name', 'band_id',
                'country', 'CID', 'location',
                'genres', 'status',
                'website']
    writer.writerow(headings)
    # Goes through every country retrieved earlier and calls the helper function
    for CID, country in all_countries.items():
        get_bands_by_country(tsvfile, writer, country, CID)
After all that hard work, let's reap the reward. Using Pandas, we can easily read in a CSV, TSV, or any value-separated file as a DataFrame. Make sure you've imported the library:
import pandas as pd
with open('encyclopedia_metallum_data.tsv', 'r') as MA_tsv:
    metal_data = pd.read_csv(MA_tsv, sep = '\t')
metal_data[['CID']] = metal_data[['CID']].replace(np.nan, 'NA')  # pandas reads the string 'NA' as NaN by default, so restore it
metal_data
I'm sorry, we're still scraping data. This is it, I promise, then we get to the fun stuff. Here we extract population data from Wikipedia. Just like the part where we scraped the countries page, we use regexes and BeautifulSoup.
Here's one row in the population data table for China:
<tr>
<td>1</td>
<td style="text-align:left;"><span class="flagicon" style="display:inline-block;width:25px;"><img alt="" class="thumbborder" data-file-height="600" data-file-width="900" height="15" src="//upload.wikimedia.org/wikipedia/commons/thumb/f/fa/Flag_of_the_People%27s_Republic_of_China.svg/23px-Flag_of_the_People%27s_Republic_of_China.svg.png" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/f/fa/Flag_of_the_People%27s_Republic_of_China.svg/35px-Flag_of_the_People%27s_Republic_of_China.svg.png 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/f/fa/Flag_of_the_People%27s_Republic_of_China.svg/45px-Flag_of_the_People%27s_Republic_of_China.svg.png 2x" width="23"/></span> <a href="/wiki/China" title="China">China</a><sup class="reference" id="cite_ref-5"><a href="#cite_note-5">[Note 2]</a></sup></td>
<td>1,388,060,000</td>
<td>December 14, 2017</td>
<td>18.3%</td>
<td style="text-align:left;"><a class="external text" href="http://worldpopulationclock.info/china" rel="nofollow">Official population clock</a></td>
</tr>
The country name is hiding in this tag:
<a href="/wiki/(COUNTRY LINK)" title="(COUNTRY)">(COUNTRY)</a>
We do the same thing we did earlier when extracting countries from Encyclopedia Metallum.
The population is in a <td> tag, so we use:
.find_all('td', style = False)
and that will give you this in a list:
<td>1</td>
<td>1,388,060,000</td>
<td>December 14, 2017</td>
<td>18.3%</td>
We simply get the 2nd element and parse the population using regexes.
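Here's that parse in isolation, using the population cell for China shown above:

```python
import re

# Same pattern used in the scraping code below
population_regex = re.compile(r'<td>([0-9,]+)</td>')
match = population_regex.match('<td>1,388,060,000</td>')
population = int(match.groups()[0].replace(',', ''))
print(population)  # 1388060000
```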
# HTTP Request
WORLD_POPULATION_URL = 'https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population'
r = requests.get(WORLD_POPULATION_URL)
# BeautifulSoup & HTML
soup = BeautifulSoup(r.content, 'html5lib')
population_soup = soup.find('table',{'class': 'wikitable sortable'})
world_population = population_soup.find_all('tr')
world_population = world_population[1:] # Remove header row
# Regexes
population_regex = re.compile('<td>([0-9,]+)</td>')
country_regex = re.compile('<a href="/wiki/.*" title="(.*)">.*</a>')
country_redirect_regex = re.compile('<a class="mw-redirect" href="/wiki/.*" title="(.*)">.*</a>')
with open('population_data_2017.csv', 'w') as population_csv:
    writer = csv.writer(population_csv)
    writer.writerow(['country', 'population'])  # Data header
    # Parse tags
    for row in world_population:
        country = row.find('a', href = True, class_ = False)
        # Territory of another country; the name is in a redirect link
        redirect = row.find('a', {'class': 'mw-redirect'})
        population_of_country = row.find_all('td', style = False)[1]
        c = country_regex.match(str(country).replace('\n', ' '))
        rd = country_redirect_regex.match(str(redirect))
        p = population_regex.match(str(population_of_country))
        if redirect is not None:
            ctry = rd.groups()[0]
        elif c is not None:
            ctry = c.groups()[0]
        if p is not None:
            pop = p.groups()[0].replace(',', '')
            writer.writerow([ctry, pop])
It's a thing of beauty, congratulations! You are now well versed in web scraping with Python.
with open('population_data_2017.csv', 'r') as population_csv:
    population_data = pd.read_csv(population_csv)
population_data.set_index('country', inplace = True)
population_data
!pip install folium
import folium
import matplotlib.pyplot as plt
import seaborn as sns
Our metal bands data has a genre column, but right now it isn't in a form where we can do any genre-specific analysis. We need to determine the big major metal genres and mark whether each band belongs to any of these major groups. Encyclopedia Metallum has us covered with the list above. Metal genres usually have an umbrella term, and anything attached to them is a subgenre, like Melodic Death Metal for example. To do this task, we're going to use regexes again and use the major umbrella terms as the keywords. Unlike regular regex matching, you want to use:
regex = re.compile(genre_str)
regex.search(band_genre)
.search() goes through the entire string instead of matching from the beginning only, very similar to how most find tools search for keywords. We also want to keep track of the number of genre types each band is classified as. This will allow us to see whether a band's genre is mostly pure or mixed with other genres.
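A quick illustration of the difference, using the Melodic Death Metal example from above ('Death' is the umbrella keyword):

```python
import re

# 'Death' is an umbrella term; 'Melodic Death Metal' is one of its subgenres
death_regex = re.compile('Death')
print(death_regex.match('Melodic Death Metal'))   # None, match() only anchors at the start
print(death_regex.search('Melodic Death Metal'))  # a match object, search() scans the whole string
```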
If you're interested in learning more about metal genres, you can check out Patrick Galbraith and Nick Grant's Interactive Map of Metal. The map allows you to see the evolution of metal music.
genre_regex_str = ['Black', 'Death', 'Doom|Stoner|Sludge',
'Electronic|Industrial', 'Experimental|Avant-garde',
'Folk|Viking|Pagan', 'Gothic', 'Grindcore|Goregrind',
'Groove', 'Hard Rock', 'Heavy', 'Metalcore|Deathcore',
'Power', 'Progressive', 'Speed', 'Symphonic', 'Thrash']
genre_regexes = []
genre_count = {}
# Create genre keyword regexes
for genre_str in genre_regex_str:
    genre_count[genre_str] = []
    genre_regexes.append(re.compile(genre_str))
# For every band, determine if it belongs to any of the major metal genre types
# Also count how many genre types each band belongs to
genre_mix = []
for _, band in metal_data.iterrows():
    mix = 0
    for g in range(len(genre_regexes)):
        genre_group = genre_regex_str[g]
        if genre_regexes[g].search(band['genres']) is not None:
            genre_count[genre_group].append(True)
            mix = mix + 1
        else:
            genre_count[genre_group].append(False)
    genre_mix.append(mix)
# Drop columns not relevant to genre analysis
genre_data = metal_data.copy(deep = True)
genre_data = genre_data.drop(['band_id', 'location', 'website'], axis = 1)
genre_data.insert(5, 'genre groups', genre_mix)
# Insert each genre column into the new DataFrame
for idx in range(len(genre_regex_str)):
    genre = genre_regex_str[idx]
    genre_data.insert((idx + 6), genre, genre_count[genre])
genre_data
After the DataFrame transformation, we can easily do band counts for each genre. Use .value_counts() on the column you want to tally instead of going through the entire DataFrame with a for loop. The DataFrame is a nice presentation of the data, especially if you sort using:
.sort_values(column_name, ascending = False)
That way we can easily see which genres are the most popular. However, there are some powerful graphing capabilities we can use to visualize the data instead of plain numbers.
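As a tiny illustration of the sort, with made-up counts:

```python
import pandas as pd

# A made-up genre tally, just to show the sort
toy = pd.DataFrame({'band count': [10, 25, 5]}, index=['Black', 'Death', 'Doom'])
print(toy.sort_values('band count', ascending=False))
# Death (25) comes out on top
```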
genre_total_data = genre_data.copy(deep = True)
genre_count = []
genre_categories = list(genre_total_data.columns)[6:]
for category in genre_categories:
    counts = genre_total_data[category].value_counts()
    if True in counts:
        genre_count.append(counts[True])
    else:
        genre_count.append(0)
g = {'genre': genre_categories, 'band count': genre_count}
genre_total_df = pd.DataFrame(data = g, index = g['genre'], columns = ['band count'])
genre_total_df.sort_values('band count', ascending = False)
For beautiful and easy-to-use plotting, import matplotlib and seaborn. You can use matplotlib
by itself, but seaborn
is much more flexible and makes fine-tuning how a graph looks much easier.
import matplotlib.pyplot as plt
import seaborn as sns
If you want to look at color options for plots, here are some useful links for matplotlib and seaborn.
Now we can plot the DataFrame we created above and answer one of our analysis questions.
Black, Death, Thrash, Heavy, and Doom/Stoner/Sludge are the top 5 genre types in metal music. This is not surprising if you're familiar with metal history: lower-count genres tend to be newer genres, whereas the top 5 are ancestors to many of the other genre types and subgenres. It's pretty evident if you go look at the Map of Metal mentioned earlier.
plt.title('Genre Classification Counts For Metal Bands')  # Set graph title
genre_plot = sns.barplot(x = list(range(0, len(genre_total_df))),
                         y = genre_total_df['band count'],
                         palette = sns.color_palette("cubehelix", len(genre_total_df)))
genre_plot.set(xlabel = 'Genre Classification',
               ylabel = 'Band Count',
               xticklabels = list(genre_total_df.index))  # Set x-axis labels since we plotted against numeric positions
plt.xticks(rotation = 90)  # Makes x-axis labels vertical
genre_plot.set_facecolor('white')  # Changes the background to white
plt.show()
So we know the top overall metal genres, but let's look at what genres dominate each country. Instead of using .value_counts(), let's use pivot_table from Pandas. The pivot_table allows us to tally the bands with a given genre type in a specific country.
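Here's pivot_table on a toy frame with the same shape as genre_data (the countries and boolean genre columns below are made up):

```python
import numpy as np
import pandas as pd

# One row per band, one boolean column per genre keyword
toy = pd.DataFrame({'country': ['Norway', 'Norway', 'Sweden'],
                    'Black': [True, True, False],
                    'Death': [False, True, True]})
counts = pd.pivot_table(data=toy, index='country',
                        values=['Black', 'Death'], aggfunc=np.sum)
print(counts)
# Norway: Black 2, Death 1; Sweden: Black 0, Death 1
```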
genre_by_country = pd.pivot_table(data = genre_data,
                                  index = ['country', 'CID'],  # Combines all rows with the same values for these columns
                                  values = genre_regex_str,  # The columns the aggregate function should apply to
                                  aggfunc = np.sum)  # Specify the aggregate function; the default is mean
genre_by_country.head()
Let's plot the DataFrame from above to see the results. Since I'll be plotting 17 genre plots, let's make it a function so that we don't have to repeat the code 17 times.
def genre_bargraph(df, title, ylabel, genre, palette):
    plt.figure(figsize = (26, 3))  # Change size of plot in inches
    plt.title(title)
    ax = sns.barplot(x = list(df.index.codes[0]),  # MultiIndex .labels was renamed .codes in newer pandas
                     y = df[genre],
                     palette = palette)
    ax.set(xlabel = 'Country',
           ylabel = ylabel,
           xticklabels = list(df.index.levels[0]))
    plt.xticks(rotation = 90)
    ax.set_facecolor('white')
Fifteen of the 17 plots show that the United States has a lot of bands in every genre. However, this does not help us answer the question of which genres dominate each country, so we have to take the band counts and divide by the total number of bands in that country.
genre_lst = list(genre_by_country.columns)
l = len(genre_by_country)
titles = ['Black Metal','Death Metal','Doom Metal','Electronic/Industrial Metal',
'Experimental/Avant-Garde Metal','Folk/Viking/Pagan Metal','Gothic Metal',
'Grindcore/Goregrind','Groove Metal','Hard Rock','Heavy Metal','Metalcore/Deathcore',
'Power Metal','Progressive Metal','Speed Metal','Symphonic Metal','Thrash Metal']
for idx in range(len(genre_lst)):
    genre_bargraph(genre_by_country,
                   'Band Count of ' + titles[idx] + ' by Country',
                   'Band Count',
                   genre_lst[idx],
                   sns.cubehelix_palette(l, start = 0.5, rot = -0.75))
plt.show()
First, we need to calculate how many bands are in each country.
band_count = pd.DataFrame(metal_data['country'].value_counts())
band_count = band_count.sort_index()
band_count.columns = ['band count']
band_count.head()
Now we can insert that into our DataFrame from above, and let's use NumPy to our advantage to calculate genre percentages. If you have two NumPy arrays of the same length, you can directly use:
a = np.array(dataA, dtype = np.float64)
b = np.array(dataB, dtype = np.float64)
list(a / b)
We do this for every genre and we're done!
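With concrete (made-up) numbers, the elementwise division looks like this:

```python
import numpy as np

# Made-up genre counts and country totals, divided elementwise
genre_counts = np.array([10, 20], dtype=np.float64)
band_totals = np.array([100, 80], dtype=np.float64)
print(list(genre_counts / band_totals))  # [0.1, 0.25]
```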
genre_percentage_by_country = genre_by_country.copy(deep = True)
genre_percentage_by_country.insert(0, 'band count', list(band_count['band count']))  # Insert total band counts into the DataFrame
lst = list(genre_percentage_by_country.columns)[1:]
for genre in lst:
    genre_arr = np.array(genre_percentage_by_country[genre], dtype = np.float64)
    total_arr = np.array(genre_percentage_by_country['band count'], dtype = np.float64)
    genre_percentage_by_country[genre] = list(genre_arr / total_arr)
genre_percentage_by_country.head()
Let's regraph so that you can see which genres are really prevalent in each country.
The answer is actually in all 17 of our graphs!
Let's look at one as an example:
If the bars for a country are high, then that genre is very popular in that specific country.
*You can double click on the graphs to enlarge them.
genre_lst = list(genre_percentage_by_country.columns)[1:]
l = len(genre_percentage_by_country)
for idx in range(len(genre_lst)):
    genre_bargraph(genre_percentage_by_country,
                   'Percentage of ' + titles[idx] + ' by Country',
                   'Percentage',
                   genre_lst[idx],
                   sns.cubehelix_palette(l, start = 1.5, rot = -0.75, dark = 0, light = 0.95))
plt.show()
Looking at the graph we made, more than 70,000 bands are purely one genre type while around 40,000 bands are a mix of 2 different genre types. Around 61% are pure, around 34% are a mix of exactly 2, and around 39% are a mix of 2 or more genre types. I'm actually surprised, since most bands are usually listed with so many genre tags.
purity_counts = pd.DataFrame(genre_data['genre groups'].value_counts())
purity_counts
plt.title('Mixed Metal Genre Counts')  # Set graph title
genre_plot = sns.barplot(x = list(purity_counts.index),
                         y = purity_counts['genre groups'],
                         palette = sns.color_palette("cubehelix", len(purity_counts) * 2))
genre_plot.set(xlabel = 'Number of Genre Types', ylabel = 'Band Count')  # Set x-axis and y-axis label names
genre_plot.set_facecolor('white')  # Changes the background to white
plt.show()
c = list(band_count.index)
overall_status_counts = metal_data['status'].value_counts()  # Tally of each band status across the whole data set
status_types = list(overall_status_counts.index)
status_count_data = {}
status_count_data['Total'] = list(band_count['band count'])
status_count_data['Active'] = []
status_count_data['Split-up'] = []
status_count_data['Changed name'] = []
status_count_data['On hold'] = []
status_count_data['Unknown'] = []
status_count_data['Disputed'] = []
for idx in range(len(c)):
    status_counts = metal_data[metal_data['country'] == c[idx]]['status'].value_counts()
    for status in status_types:
        if status in status_counts:
            status_count_data[status].append(status_counts[status])
        else:
            status_count_data[status].append(0)
status_data = pd.DataFrame(data = status_count_data, index = c, columns = ['Total'] + status_types)
status_data.head()
We need to insert the population data we scraped earlier into the new DataFrame, so we have to merge the two data sets together. Since some countries have different names in the two data sets, I created an alias dictionary to solve the problem.
alias = {'Korea, South': 'South Korea',
'Macedonia (FYROM)': 'Republic of Macedonia',
'Georgia': 'Georgia (country)',
'Ireland': 'Republic of Ireland'}
pop_dataframe_countries = list(population_data.index)
countries = list(status_data.index)
population_info = []
for country in countries:
    if country in alias:
        country = alias[country]
    if country in pop_dataframe_countries:
        country_population = population_data.loc[country]['population']
        population_info.append(country_population)
    else:
        population_info.append(0)
status_data.insert(0, 'Population (2017)', population_info)
status_data = status_data[status_data['Population (2017)'] != 0] # Remove countries with no population match
status_data.head()
I wanted to calculate 3 different per capitas: one overall, one for currently active bands, and one for bands that have split up. This will tell us which countries have the most metal music activity in proportion to population.
The equation for calculating Per Capita (# of bands per person):
$\text{Per Capita} = \dfrac{\text{band count}}{\text{population}} \times \text{(scalar)}$
The scalar lets you control the proportion; if you want the original definition of per capita, the scalar is 1, i.e. one per person in the population. I'm going to do per every ten thousand individuals, so my scalar is 10,000.
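Plugging made-up numbers into the equation (3,000 bands, a population of 5.5 million, scalar 10,000):

```python
# Per capita sketch with invented numbers
band_count = 3000
population = 5500000
per_capita = band_count / population * 10000  # bands per 10,000 people
print(round(per_capita, 2))  # 5.45
```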
total_arr = np.array(list(status_data['Total']), dtype = np.float64)
active_arr = np.array(list(status_data['Active']), dtype = np.float64)
split_arr = np.array(list(status_data['Split-up']), dtype = np.float64)
population_arr = np.array(list(status_data['Population (2017)']), dtype = np.float64)
status_data.insert(2, 'Total Per Capita (10,000)', list((total_arr / population_arr) * 10000))
status_data.insert(4, 'Active Per Capita (10,000)', list((active_arr / population_arr) * 10000))
status_data.insert(6, 'Split-up Per Capita (10,000)', list((split_arr / population_arr) * 10000))
status_data.head()
Here, we're going to use folium to make an interactive choropleth map; it'll let us see which country is the most metal better than a bar graph can. I was inspired by this Metal Injection blog post.
!pip install folium
import folium
The GeoJSON file of the world is located here.
This isn't exactly difficult if your data matches the GeoJSON file exactly, but in our case the file has information for a lot more countries than there are in the DataFrame. The subsequent 2 cells contain code that modifies the DataFrame to include all the countries in the GeoJSON file; for countries that were not in our DataFrame, we have to give them filler values.
choropleth_data = status_data.copy(deep = True)
alias = {'Serbia': 'Republic of Serbia',
'Macedonia (FYROM)': 'Macedonia',
'Korea, South': 'South Korea',
'United States': 'United States of America'}
# Going through the GeoJSON file to extract all the countries
with open('countries.geo.json', 'r') as country_boundaries_geojson:
    country_boundaries = json.load(country_boundaries_geojson)
cb = [feature['properties']['ADMIN'] for feature in country_boundaries['features']]
l = list(choropleth_data.index)
geojson_country_name = []
# Making a column of the country names from the GeoJSON file.
# In order to make the map, all country names in the DataFrame must
# match the names in the GeoJSON file.
for name in l:
    if name in alias:
        geojson_country_name.append(alias[name])
    elif name in cb:
        geojson_country_name.append(name)
choropleth_data.insert(0, 'geojson_country_name', geojson_country_name)
choropleth_data = choropleth_data.drop(['Unknown', 'Changed name', 'On hold', 'Disputed'], axis = 1)
choropleth_data.head()
# Collect all the countries from the GeoJSON file that were not matched
no_match = [name for name in cb if name not in geojson_country_name]
# Creating a DataFrame for the non-matched countries
# and giving them filler values
filler = [-1] * len(no_match)
filler_data = {}
choropleth_col = list(choropleth_data.columns)
for h in choropleth_col:
    if h == 'country':
        filler_data[h] = [''] * len(no_match)
    elif h == 'CID':
        filler_data[h] = ['--'] * len(no_match)
    elif h == 'geojson_country_name':
        filler_data[h] = no_match
    else:
        filler_data[h] = filler
# Combine the original DataFrame with the non-match DataFrame
geojson_no_matches = pd.DataFrame(data = filler_data, columns = choropleth_col)
choropleth_data = pd.concat([choropleth_data, geojson_no_matches])  # DataFrame.append was removed in newer pandas
choropleth_data
folium has a built-in choropleth function. Because I have to graph 3 of them, I made a function and added them all in as layers. Because the map is so taxing on the computer, saving it as an HTML file is a good option if your computer can't handle the large amount of graphics.
View the full map by downloading it here
With the map layers, we answer our 2nd question.
Finland and those Scandinavians! Finland is at the top of all 3 maps, which shows that metal music activity in Finland is extremely high even though it is not one of the most populous countries in the world.
Here are some resources for choropleth maps:
# Function to add a choropleth overlay to the base map
def build_choropleth(base_map, name, field, threshold, color):
    layer_name = (name + ' Metal Bands Per Capita (10,000)').strip()
    base_map.choropleth(name = layer_name,
                        geo_data = 'countries.geo.json',  # GeoJSON file containing the country boundary lines
                        data = choropleth_data,  # DataFrame to plot from
                        columns = ['geojson_country_name', field + ' Per Capita (10,000)'],  # Which columns to plot
                        key_on = 'feature.properties.ADMIN',  # Which property in the GeoJSON file to match the column data with
                        threshold_scale = threshold,  # Adjusts the legend color scale
                        fill_color = color, fill_opacity = 0.5, line_opacity = 0.5,  # Choropleth color overlay options
                        legend_name = layer_name,
                        reset = True, highlight = True)
base_map = folium.Map(location = [0,0], zoom_start = 1.5)
build_choropleth(base_map, '', 'Total', [0,1,2,5,6,7], 'BuPu')
build_choropleth(base_map, 'Active', 'Active', [-1,0,1,2,3,4], 'BuGn')
build_choropleth(base_map, 'Split-up', 'Split-up', [-1.0, -0.1, 0.4, 0.9, 1.4, 1.9], 'YlOrRd')
folium.LayerControl().add_to(base_map)
base_map.save('metalband_choropleth_map.html')  # Save as HTML because most computers can't handle loading the map live
If you've stuck with me until the end, then congratulations! We did some incredible things in Python 3 and answered these metal music questions:
In comparison with the Kaggle notebook, we had pretty similar results even though the data set here is 20 times larger. Our most metal country was Finland, but for the Kaggle analysis it was the Faroe Islands, which is actually a group of islands off the Scandinavian peninsula. A lot of the genre analysis was similar as well.
Thanks for reading and following along!