You should be able to get an empty database built from source by the supplier. Failing that, generating scripts directly from the database and running them on a new blank one, as Kin suggests, will be required.
Getting a limited but useful amount of data is more complicated than simply taking X,000 rows or Y% of each table: much of the data will end up not linking together correctly. In fact, if the database is properly set up with referential integrity enforced by foreign key constraints, this simplistic process will simply fail. To create a subset of the data you need to work with the structure of the data: for instance, with our training-records system you might extract 10 teams of people out of the hundreds, then extract their training records, then the audit-trail records associated with those, and so forth, as sketched below. That way everything you have links together as real data instead of being arbitrary combinations of rows that might not relate to each other.
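As a rough sketch of that walk-down-the-foreign-keys approach, in PostgreSQL syntax against a purely hypothetical schema (`teams`, `people`, `training_records`, and `audit_trail` are illustrative names, and `test` is assumed to be a schema in the same database built from the same DDL; none of this is from the question itself):

```sql
-- Hypothetical schema: teams(id), people(id, team_id),
-- training_records(id, person_id), audit_trail(id, training_record_id).

-- 1. Pick a small, self-consistent starting set: 10 random teams.
CREATE TEMP TABLE picked_teams AS
SELECT id FROM teams ORDER BY random() LIMIT 10;

-- 2. Copy rows in dependency order so every foreign key resolves.
INSERT INTO test.teams
SELECT t.* FROM teams t JOIN picked_teams p ON p.id = t.id;

INSERT INTO test.people
SELECT pe.* FROM people pe JOIN picked_teams p ON p.id = pe.team_id;

INSERT INTO test.training_records
SELECT tr.* FROM training_records tr
JOIN test.people pe ON pe.id = tr.person_id;

INSERT INTO test.audit_trail
SELECT a.* FROM audit_trail a
JOIN test.training_records tr ON tr.id = a.training_record_id;
```

Because each step only copies rows whose parents were copied in the previous step, the subset stays internally consistent no matter how small you make the initial selection.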
Also be careful when taking copies of real data for testing/development purposes: you may be in breach of data protection rules/regulations/laws (or the owner of the database you are taking a partial copy of may be in breach of them by allowing you access). At the very least you will probably need to randomise any personally identifying or otherwise sensitive data.
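A minimal anonymisation pass over the copied subset might look like the following; the column names are hypothetical, and `md5(random()::text)` is just one cheap PostgreSQL trick for producing plausible-but-meaningless values:

```sql
-- Overwrite anything personally identifying in the test copy.
UPDATE test.people
SET full_name = 'Person ' || id,                    -- deterministic, obviously fake
    email     = 'user' || id || '@example.invalid', -- reserved TLD, can never send mail
    phone     = NULL,
    notes     = md5(random()::text);                -- keep the column populated,
                                                    -- destroy the content
```

Run this before anyone else gets access to the copy, not after.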
A better solution is to generate this data rather than copying it from the production database. This way you can control the data size fairly directly to match your needs and can engineer in all the potential oddities that you want for testing purposes to make sure your changes don't introduce regressions in dealing with edge cases. It also means that you do not need to worry about data protection issues as you are not dealing with real records about real people.
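A sketch of that, again in PostgreSQL against the same hypothetical schema, using `generate_series` to control the volume and to plant deliberate edge cases (an empty team, a person with no training records):

```sql
-- 10 teams, ~180 people, and one training record per person.
INSERT INTO test.teams (id, name)
SELECT g, 'Team ' || g FROM generate_series(1, 10) AS g;

INSERT INTO test.people (id, team_id, full_name)
SELECT g,
       ((g - 1) % 9) + 1,   -- only teams 1-9 are used: team 10 stays empty
       'Person ' || g
FROM generate_series(1, 180) AS g;

INSERT INTO test.training_records (id, person_id, completed_on)
SELECT g, g,
       CURRENT_DATE - (g % 365)   -- spread completion dates over the past year
FROM generate_series(1, 179) AS g; -- person 180 gets no records: another edge case
```

Scaling up is just a matter of changing the `generate_series` bounds, and every oddity in the data is one you put there on purpose.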
See this answer for another discussion of manufacturing test data and its benefits and pitfalls.