gScrape - Video Game Info Scraper

billyc666

New member
RL Member
Thanks, that works great. I never realised I could copy from the info viewer.
Is there any way to get it to scrape only the info I need, i.e. just publisher, rating, score, and info (for players)?
 

billyc666

New member
RL Member
Just a quick question before I start: do you have a full PS1 and PS2 XML already scraped that I can use in the info viewer?

Great app, by the way.
 

dustind900

Member
Supporter
RL Member
Project is still alive!!! Basically I was unsatisfied with the app so I learned a new programming language (.net, C#, and WPF) and started rewriting everything. Here is a quick demo of the "GamesDB" section of the NEW gScrape.

Watch on YouTube for HD...
 

billyc666

New member
RL Member
Looking forward to this. It's already been a big help filling in the info for dark's aeon nox hyperspin theme.

**edit** Forgot to ask: if a field isn't filled by one scraper, will it be filled by another? E.g. MobyGames doesn't have a score but another scraper does.
 

dustind900

Member
Supporter
RL Member
billyc666 wrote: "if a field isn't filled by one scraper will it be filled by another. ie mobygames doesn't have score but other scraper does"
Yes and no. I've decided to go with individual files for each site/game; I'm not a fan of huge XML files. There will be an info merge feature, though. ;)
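
For anyone wondering what such a merge could look like, here is a minimal sketch in Python. It is not gScrape's actual code (gScrape is written in C#), and the function name `merge_info`, the field names, and the sample values are all hypothetical. The idea is simply that sources are listed in priority order and a later source only fills fields an earlier one left empty.

```python
from typing import Dict, List, Optional


def merge_info(sources: List[Dict[str, Optional[str]]]) -> Dict[str, str]:
    """Merge per-site records: earlier sources win, later ones fill gaps."""
    merged: Dict[str, str] = {}
    for record in sources:
        for field, value in record.items():
            # Take a value only if the field is still missing and the value is non-empty.
            if field not in merged and value not in (None, ""):
                merged[field] = value
    return merged


# Hypothetical records, as if loaded from two per-site files for the same game.
moby = {"name": "Destiny", "publisher": "Activision Publishing, Inc.", "score": None}
gamesdb = {"name": "Destiny", "score": "7.5", "players": "1"}

merged = merge_info([moby, gamesdb])
print(merged)
```

Putting the preferred site first in the list keeps its values, while the missing score is taken from the second site.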


What to expect:
  • Game/System info from TheGamesDB
  • Game info from MobyGames
  • System info from VideoGameConsoleLibrary

There will be more, but I don't want to get ahead of myself. There will be some slight changes to the Moby XML format. Currently the XML will not validate correctly because of all the nested "country" tags. I have a new format now that will validate correctly. It won't take long to rebuild your collection though, because gScrape now fully utilizes multithreading. Not only will you be blown away by the speed at which gScrape collects info, you will also be blown away by the fact that you can collect info from multiple sites at once.
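
As a rough illustration of collecting from multiple sites at once, here is a small Python sketch using a thread pool. This is an assumption about the general approach, not how gScrape does it in .NET. The two `fetch_*` functions are hypothetical placeholders that return canned data instead of making web requests.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def fetch_thegamesdb(game: str) -> Dict[str, str]:
    # Placeholder: a real scraper would make a web request here.
    return {"site": "TheGamesDB", "game": game}


def fetch_mobygames(game: str) -> Dict[str, str]:
    # Placeholder: a real scraper would make a web request here.
    return {"site": "MobyGames", "game": game}


SCRAPERS: Dict[str, Callable[[str], Dict[str, str]]] = {
    "TheGamesDB": fetch_thegamesdb,
    "MobyGames": fetch_mobygames,
}


def scrape_all(game: str, max_workers: int = 4) -> Dict[str, Dict[str, str]]:
    """Run every scraper for one game concurrently and collect the results by site."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn, game) for name, fn in SCRAPERS.items()}
        # result() blocks until that scraper is done; the scrapers run in parallel.
        return {name: future.result() for name, future in futures.items()}


results = scrape_all("Destiny")
print(results)
```

Because scraping is mostly waiting on the network, threads are a reasonable fit: the total time is roughly that of the slowest site rather than the sum of all of them.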

I'll let you guys chew on that for a while. No time frame for a release... Work is super busy right now. I code as much as I can when I can.

gScrape WILL be the best game scraper available...
 

brolly

Administrator
Developer
Looking good dustin, and you surely won't regret the technology change ;)
I totally agree with keeping separate XML files for each game.

You know you'll need a GameFAQs scraper afterwards lol
 

dustind900

Member
Supporter
RL Member
GameFAQs actually has a nicely structured site, so it's rather easy to scrape info from it. The only downside is that if you hit their servers with too many web requests, they will IP-ban you quickly.
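
One common way to stay under a site's radar is to enforce a minimum delay between requests. Below is a minimal Python sketch of that idea. The `Throttle` class is hypothetical, and the safe interval for GameFAQs is not known, so the value is left to the caller.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum number of seconds between consecutive calls to wait()."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

Calling `throttle.wait()` right before each web request spaces the requests out. The clock and sleep functions are injectable only so the behaviour can be checked without real waiting.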
 

dustind900

Member
Supporter
RL Member
I'm experiencing a bug that is really driving me nuts... If anyone can help let me know.

The problem is that some of my web requests return a 500 Internal Server Error. Normally this would be nothing to bother with, but it happens on URLs you can visit with no problems in the browser. I've tried using WebClient(), a direct HttpRequest, and loading the page using XDocument.Load(url). I'm starting to assume this is a bug on TheGamesDB's end, but I figured I would post here to see if anyone can help me out.

Trouble Url = http://thegamesdb.net/api/GetGame.php?id=17594
GameName = IronSoul
System = PC

Maybe it's my PC?

Out of some 30,000 games, this only affects about 5 that I've noticed so far. Really this isn't a huge problem, but if it can be fixed...
 

brolly

Administrator
Developer
Error 500 indicates a server-side error, but since the URL does work from a browser, perhaps there's something strange being sent in your request headers and the server doesn't like it. Then again, since it works with all your other requests, it's pretty strange indeed.
Hard to make a guess without actually trying to debug it. Your best bet would be to contact the site owner so he can look at the server side and see what's causing the issue. Since they are providing an API, it should be OK to contact him.
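
One cheap thing to try in the meantime is sending headers that resemble a browser's and seeing whether the response changes. The Python sketch below only shows how such a request could be built. The header values are assumptions picked for illustration, and there is no evidence that they fix this particular 500 error.

```python
import urllib.request
from typing import Dict

# Assumed header values for the experiment; adjust to match what your browser sends.
BROWSER_LIKE_HEADERS: Dict[str, str] = {
    "User-Agent": "Mozilla/5.0 (compatible; example-client)",
    "Accept": "text/xml,application/xml;q=0.9,*/*;q=0.8",
}


def build_request(url: str) -> urllib.request.Request:
    """Create a request object that carries browser-like headers."""
    return urllib.request.Request(url, headers=dict(BROWSER_LIKE_HEADERS))


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download the URL; raises urllib.error.HTTPError on a 4xx/5xx response."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Comparing the result of `fetch(url)` with and without the extra headers would at least show whether the headers are involved.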
 

dustind900

Member
Supporter
RL Member
Info for MobyGames will be separated per game, and each game will have multiple info files (user's choice). There will be basic info, ratings info, release info, and others.

Basic Info will look like this:
Code:
<?xml version="1.0" encoding="utf-8"?>
<game>
  <id>minecraft</id>
  <name>Minecraft</name>
  <system url="/browse/games/browser/">Browser</system>
  <userscore>3.3</userscore>
  <publishers>
    <publisher>Mojang*AB</publisher>
  </publishers>
  <developers>
    <developer>Mojang*AB</developer>
  </developers>
  <releases>
    <release>Jun 30, 2010</release>
  </releases>
  <platforms>
    <platform url="/game/linux/minecraft">Linux</platform>
    <platform url="/game/macintosh/minecraft">Macintosh</platform>
    <platform url="/game/windows/minecraft">Windows</platform>
    <platform url="/browse/games/browser/">Browser</platform>
  </platforms>
  <genres>
    <genre>Action</genre>
  </genres>
  <perspectives>
    <perspective>1st-Person*Perspective</perspective>
    <perspective>3rd-Person*Perspective</perspective>
  </perspectives>
  <subgenres>
    <subgenre type="Theme">Fantasy</subgenre>
  </subgenres>
  <alternatetitles>
    <title name="Development title">Survival Mode</title>
    <title name="Beta release">Minecraft Beta</title>
    <title name="Alpha release">Minecraft Alpha</title>
    <title name="Working title">Infdev</title>
    <title name="Working title">Indev</title>
  </alternatetitles>
  <groups>
    <group url="/game-group/gameplay-feature-alchemy">Gameplay feature: Alchemy</group>
    <group url="/game-group/gameplay-feature-armourweapon-deterioration">Gameplay Feature: Armour/Weapon Deterioration</group>
    <group url="/game-group/gameplay-feature-blacksmithing">Gameplay feature: Blacksmithing</group>
    <group url="/game-group/gameplay-feature-botany-farming-gardening">Gameplay feature: Botany, Farming, Gardening</group>
    <group url="/game-group/gameplay-feature-creature-breeding-fusion">Gameplay feature: Creature breeding / fusion</group>
    <group url="/game-group/gameplay-feature-day-night-cycle">Gameplay feature: Day / Night cycle</group>
    <group url="/game-group/gameplay-feature-fishing">Gameplay feature: Fishing</group>
    <group url="/game-group/gameplay-feature-freely-destructible-terrain">Gameplay feature: Freely destructible terrain</group>
    <group url="/game-group/gameplay-feature-hunger-thirst">Gameplay feature: Hunger / Thirst</group>
    <group url="/game-group/gameplay-feature-hunting">Gameplay feature: Hunting</group>
    <group url="/game-group/gameplay-feature-mining">Gameplay feature: Mining</group>
    <group url="/game-group/gameplay-feature-survival-cooking">Gameplay feature: Survival cooking</group>
    <group url="/game-group/games-with-randomly-generated-environments">Games with randomly generated environments</group>
    <group url="/game-group/genre-wilderness-survival">Genre: Wilderness Survival</group>
    <group url="/game-group/minecraft-series">Minecraft series</group>
    <group url="/game-group/user-fan-contributed-content">User / fan contributed content</group>
    <group url="/game-group/visual-technique-style-voxel-graphics">Visual technique / style: Voxel graphics</group>
  </groups>
  <relatedwebsites>
    <website url="http://www.minecraft.net/">Minecraft</website>
  </relatedwebsites>
  <description>Minecraft is a game which mixes elements of sandbox, survival horror and construction game. It is an evolved version of Minecraft Classic.As in Minecraft Classic the player is dropped in a randomly generated world made out of cubes. Unlike the classic version where the player can add and remove blocks at will, in this version all blocks the player wants to place must first be "mined" elsewhere. Some materials require special equipment to be mined. Stone requires a wooden pick axe to be mined; gold requires an iron pick axe and obsidian requires a diamond pick axe. The player can also create shovels and normal axes to mine sand, dirt and wood faster. Mining is no longer instant but requires the player to hit the block a couple times, tools reduce the time.Some blocks can not be found in nature but require crafting. Clay, for example, can be split into clay balls, which when baked turn into bricks which can be combined to form brick wall blocks. Baking happens in a stone oven, which requires a steady supply of coal and the oven itself needs to be crafted at a workbench, which needs to be constructed first.This time the player(s) are not alone. During the day pigs, chickens, cows and sheep roam the land. Some of them, when killed or struck, will drop valuable items such as leather required for protective clothing or pork chops which when baked can heal the player. But at night and in the dark randomly generated caverns monsters rule the land. There are various different monsters, including the zombie, a skeleton archer, an exploding creeper, deadly spiders and gelatinous cubes. Aside from randomly spawning in dark areas, they also spawn in so-called "mob spawners" which spawn enemies indefinitely until properly illuminated or destroyed. To defend himself against enemies the player can craft weapons such as a sword and bow, and protective clothing to reduce damage. 
When the player dies his items are dropped at the place of his death, but the player respawns at his original spawn point. Items can be recovered if the player reaches them within five minutes (unless they fell into lava).The randomly generated worlds are structured in such a fashion that more valuable resources are either rare or only spawn in deep caverns far below the ground.Aside from building blocks the game also, unlike its previous versions, offers more complex building. The player can create railway systems and ride mine carts, row in a small boat, and build pressure plates, switches, doors and electrical circuits to power various contraptions.The game features no pre-set goals and advocates exploration and construction.</description>
</game>
Code:
<?xml version="1.0" encoding="utf-8"?>
<game>
  <id>destiny_</id>
  <name>Destiny</name>
  <system url="/browse/games/playstation-4/">PlayStation 4</system>
  <officialsite url="http://www.destinythegame.com/">Destiny</officialsite>
  <criticscore>73</criticscore>
  <userscore>3.3</userscore>
  <publishers>
    <publisher>Activision*Publishing,*Inc.</publisher>
  </publishers>
  <developers>
    <developer>Bungie,*LLC</developer>
  </developers>
  <releases>
    <release>Sep 09, 2014</release>
  </releases>
  <platforms>
    <platform url="/game/ps3/destiny_">PlayStation 3</platform>
    <platform url="/game/xbox360/destiny_">Xbox 360</platform>
    <platform url="/game/xbox-one/destiny_">Xbox One</platform>
    <platform url="/browse/games/playstation-4/">PlayStation 4</platform>
  </platforms>
  <rating name="ESRB Rating">Teen</rating>
  <genres>
    <genre>Action</genre>
  </genres>
  <perspectives>
    <perspective>1st-Person*Perspective</perspective>
    <perspective>3rd-Person*Perspective</perspective>
  </perspectives>
  <subgenres>
    <subgenre type="Theme">Persistent*Universe</subgenre>
    <subgenre type="Theme">Post-Apocalyptic</subgenre>
    <subgenre type="Theme">Sci-Fi*/*Futuristic</subgenre>
    <subgenre type="Theme">Shooter</subgenre>
  </subgenres>
  <groups>
    <group url="/game-group/destiny-series">Destiny series</group>
    <group url="/game-group/japanese-playstation-3-game-releases-with-full-english-support">Japanese PlayStation 3 game releases with full English support</group>
    <group url="/game-group/japanese-playstation-4-game-releases-with-full-english-support">Japanese PlayStation 4 game releases with full English support</group>
    <group url="/game-group/middleware-cri">Middleware: CRI</group>
    <group url="/game-group/middleware-demonware">Middleware: DemonWare</group>
    <group url="/game-group/middleware-facefx">Middleware: FaceFX</group>
    <group url="/game-group/middleware-speedtree">Middleware: SpeedTree</group>
    <group url="/game-group/middleware-umbra-3">Middleware: Umbra 3</group>
    <group url="/game-group/middleware-wwise">Middleware: Wwise</group>
    <group url="/game-group/physics-engine-havok">Physics Engine: Havok</group>
    <group url="/game-group/setting-earths-moon">Setting: Earth's Moon</group>
    <group url="/game-group/setting-mars">Setting: Mars</group>
    <group url="/game-group/setting-venus">Setting: Venus</group>
  </groups>
  <description>Destiny is a futuristic first-person shooter set in a persistent world. When humans first set foot on Mars they discovered the Traveler, a giant sphere that then allowed mankind to colonize space quickly. Centuries later the Darkness, an enemy of the Traveler, arrived and waged war against it and all of its creations. The Traveler is forced to scarify itself as a measure of protection and what is left of humanity on Earth is forced to live in a city constructed underneath. The Traveler enlists warriors known as Guardians with special abilities as a personal army. These Guardians are guided by Ghosts, artificial intelligence. The player's character is found by such a Ghost (acting as a robot companion in the game) and becomes a Guardian.At the start the player chooses a class (Titan, Hunter or Warlock - corresponding with the classic Fighter, Rogue and Mage) and then customizes the appearance through race (Human, Awoken or Exo), gender, face, hair and marking. Each class has distinct abilities. The Titan is a fighter with an advanced mech suit, the Hunter is a bounty hunter with a knack for reconnaissance and ranged weapons, and the Warlock is a magical being that uses spells and explosives. Each class has various subclasses (unlocked at level 15) and specific abilities and statistics. None of the weapons in the game are restricted to a single class. The most distinct differences are the grenade types, jump abilities, melee attack and a special ability. The game borrows elements from the MMORPG genre in its design, with the Tower as a main hub zone to interact with NPCs, buy weapons, gear and Ghost upgrades, receive mail, buy and upgrade ships etc. Rewards for completing missions can also usually be collected here. The other major zones are Earth, the Moon, Mars and Venus, with some additional locations. While fighting, movement and jumps are important when confronting enemies. 
A Guardian can wield up to three main weapons (primary, special and heavy) next to grenades and other abilities. Only one weapon of each type can be equipped, but nine more for each type can be stored in an inventory to swap them.Ships are used to travel through space to other planets. These voyages are not an interactive scene however as the game does not have a complete open world design. Each planet has its own story and various quests. Players complete missions by defeating enemies, collecting loot. Glimmer, the main currency is used to buy upgrades, vehicles, emblems ... The character also gains experience and levels up, accessing new abilities. From level 18 Vanguard Marks are unlocked as an additional currency for high-end gear, along with Vanguard Reputation as a reward for completing bounties. There are daily and weekly challenges to earn Marks, but the amount that can be earned has time constrictions. From level 20 the general experience system is replaced by Mote of Light. At that level the characters also gets to join one of three factions: Dead Orbit, New Monarchy or Future War Cult. The main story consists of different campaign missions that need to be completed for The Speaker, unlocked gradually and opened up on Earth, Moon, Venus and Mars respectively. Once unlocked however, they can be completed in any order. Next to the mission progression it is possible to enter a planet in exploration mode, offering free exploration. Each planet generally has five major story missions. These can be completed alone or in groups of six players working together.Players can also optionally organize themselves in Fireteams for Strike missions, similar to MMORPG dungeons, that are harder with various mini-bosses and bosses. Fireteams are organized through online matchmaking and it is possible to make them public or private, and invite players. Only in Fireteams is voice chat possible. Outside of it communication is limited to gestures and prompts. 
Players also do not see all other online players at that location, only a selection. For the instance missions, the player's character (and possibly other team members) are drawn in a private part in the game, cut off from the open game world. In these section characters can revive each other and there are checkpoints for respawns. Loot is personalized and does not need to be distributed between players. Each planet contains five major loot chests. Certain weapons and armour have a ranking system where they become better over time. There are also various types of engrams, a type of pattern for weapons or amour that can be turned into a special item with colour coding for the rarity level. Decoding them improves a separate ranking system with leveling to access additional engrams. When dead, other players can attempt to revive the player's character with its Ghost nearby or eventually it respawns. Moving around on planets can be done on speeder bikes.There are also Raids, much harder cooperative missions added gradually by the developer that do not support matchmaking and can only be taken on with players at a high level with high-end gear. Unlike Strike missions, no objectives or directions are given, players have to figure it out for themselves. Raids can be paused and resumed, unlike most of the rest of the game, but are reset after a weekly cycle. Competitive multiplayer is available in the The Crucible PvP game mode in a separate location. It offers game modes unlocked gradually including Clash, Control, Skirmish and Rumble. They come with different team sizes and include team deathmatch, capturing zones, team modes with vehicles and more. Level differences between players are nullified except for the skill tree unlocks, but the strength of the gear, weapons and equipped items makes a significant difference. The Crucible offers Crucible Marks as a currency along with Crucible Reputation.</description>
</game>

The asterisks (e.g. Bungie,*LLC) in the game info do not appear in the actual XML files. Not sure why the forum adds them.
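
Because each basic info file is plain XML, picking out only a few fields (publisher, rating, scores) is straightforward. Here is a minimal Python sketch using a trimmed copy of the Destiny example above, without the forum's asterisks. The function name `extract_fields` and the choice of fields are illustrative assumptions, not part of gScrape.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Trimmed copy of the Destiny basic info example.
SAMPLE = """<game>
  <id>destiny_</id>
  <name>Destiny</name>
  <criticscore>73</criticscore>
  <userscore>3.3</userscore>
  <publishers>
    <publisher>Activision Publishing, Inc.</publisher>
  </publishers>
  <rating name="ESRB Rating">Teen</rating>
</game>"""


def extract_fields(xml_text: str) -> Dict[str, Any]:
    """Return only the selected fields from a basic info document."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "publishers": [p.text for p in root.findall("publishers/publisher")],
        "rating": root.findtext("rating"),
        # findtext returns None when the element is absent (e.g. no critic score).
        "criticscore": root.findtext("criticscore"),
        "userscore": root.findtext("userscore"),
    }


info = extract_fields(SAMPLE)
print(info)
```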


Getting close...
 

dustind900

Member
Supporter
RL Member
Ratings Info
Code:
<?xml version="1.0" encoding="utf-8"?>
<game baseurl="http://www.mobygames.com" url="/game/playstation-4/destiny_/rating-systems" id="destiny_" name="Destiny">
  <system url="/browse/games/playstation-4/">PlayStation 4</system>
  <ratings>
    <rating name="ESRB Rating" value="Teen">
      <descriptors>Animated Blood, Violence</descriptors>
    </rating>
    <rating name="PEGI Rating" value="16">
      <descriptors>Violence</descriptors>
    </rating>
    <rating name="OFLC (Australia) Rating" value="M">
      <descriptors>Fantasy Violence, Online Interactivity</descriptors>
    </rating>
  </ratings>
</game>
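
For reference, a ratings file in this shape can be read into a simple lookup table. The Python sketch below works on a shortened copy of the example above. The function name `read_ratings` is hypothetical, and splitting the descriptors on commas is an assumption based on how they appear in the sample.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Shortened copy of the Destiny ratings example.
SAMPLE = """<game baseurl="http://www.mobygames.com" url="/game/playstation-4/destiny_/rating-systems" id="destiny_" name="Destiny">
  <ratings>
    <rating name="ESRB Rating" value="Teen">
      <descriptors>Animated Blood, Violence</descriptors>
    </rating>
    <rating name="PEGI Rating" value="16">
      <descriptors>Violence</descriptors>
    </rating>
  </ratings>
</game>"""


def read_ratings(xml_text: str) -> Dict[str, Dict[str, Any]]:
    """Map each rating system name to its value and list of descriptors."""
    root = ET.fromstring(xml_text)
    ratings: Dict[str, Dict[str, Any]] = {}
    for rating in root.findall("ratings/rating"):
        descriptors = rating.findtext("descriptors", default="")
        ratings[rating.get("name", "")] = {
            "value": rating.get("value"),
            # Assumes descriptors are a comma-separated list, as in the sample.
            "descriptors": [d.strip() for d in descriptors.split(",") if d.strip()],
        }
    return ratings


ratings = read_ratings(SAMPLE)
print(ratings["ESRB Rating"])
```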
 

dustind900

Member
Supporter
RL Member
I could use some help naming some of the XML elements so that the info they contain is easily understood.

Example:
Code:
<element1>
  <element2>
    <publisher>value</publisher>
    <developer>value</developer>
    <element3>
      <countries>
        <country>value</country>
        <country>value</country>
      </countries>
      <releasedate>value</releasedate>
      <serials>
        <serial name="value">value</serial>
        <serial name="value">value</serial>
      </serials>
      <comments>value</comments>
    </element3>
  </element2>
  <element2>
    <publisher>value</publisher>
    <developer>value</developer>
    <element3>
      <countries>
        <country>value</country>
        <country>value</country>
      </countries>
      <releasedate>value</releasedate>
      <serials>
        <serial name="value">value</serial>
        <serial name="value">value</serial>
      </serials>
      <comments>value</comments>
    </element3>
  </element2>
</element1>

Based on data from this page and the example above, I need a name for element1, element2, and element3.
 

brolly

Administrator
Developer
That structure doesn't seem to follow the same one as the Moby page. Perhaps fill it with actual data so we can understand it better.

Still, for the ones I did understand I suggest:
element1=regionalreleases
element2=regionalrelease
element3=here is where I don't understand your structure when compared to the HTML page
 