Hi,
I only have experience with the ATmega32U4.
1. If you can use the MCU directly, that would be better.
Reasons (in my opinion):
- it takes less space
- a Teensy 2.0 board could physically get in the way of other components (switches, for example)
- it's easier to assign pins the way you want (instead of being stuck with the Teensy's fixed pinout)
2. Soldering small SMD components is hard; it requires proper tools and practice.
About the raw chip:
After soldering the required components, you must program a bootloader onto it.
In your design, you must make the programming pins accessible. Check this example:

Connect these pins to a programmer and use software to write the bootloader to your MCU (the ATmega32U4, which is now soldered onto your PCB).
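
To make the wiring a bit more concrete, here is a rough sketch of how a Pro Micro used as a programmer is typically connected to the target chip. This assumes the Pro Micro is running the stock ArduinoISP sketch with its default pin choices (pin 10 as reset, hardware SPI pins for the rest); the table name and helper function are just made up for illustration, so double-check against your own board and sketch before soldering anything.

```python
# Assumed wiring: Pro Micro running the ArduinoISP sketch (default pins)
# on the left, ATmega32U4 target pins on the right.
# ISP_WIRING and format_wiring are hypothetical names for this example.
ISP_WIRING = [
    ("10", "RESET"),
    ("16", "MOSI (PB2)"),
    ("14", "MISO (PB3)"),
    ("15", "SCK (PB1)"),
    ("VCC", "VCC"),
    ("GND", "GND"),
]


def format_wiring(wiring):
    """Return one human-readable line per programmer-to-target connection."""
    lines = [f"Pro Micro {src:<4} -> target {dst}" for src, dst in wiring]
    return "\n".join(lines)


print(format_wiring(ISP_WIRING))
```

These six connections are the ones the programming pads on your PCB need to expose.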
The cheapest option for a programmer is an Arduino Pro Micro (which I'm currently using), and the software is
AVRDUDESS (you also need the appropriate hex file for the bootloader).
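
AVRDUDESS is a GUI front end for avrdude, so it may help to see roughly what command it ends up running. Below is a minimal sketch that only builds the command line. The programmer type (`avrisp`), the baud rate (19200), the port (`COM3`) and the file name `bootloader.hex` are assumptions for a Pro Micro running ArduinoISP, not values from my setup, so adjust them to yours. Fuse settings are not covered here.

```python
import shlex


def build_avrdude_command(hex_path, port, programmer="avrisp", baud=19200, part="m32u4"):
    """Build an avrdude argument list that writes hex_path to the target's flash.

    The defaults are assumptions for an Arduino-as-ISP style programmer
    talking to an ATmega32U4; change them to match your hardware.
    """
    return [
        "avrdude",
        "-c", programmer,               # programmer type
        "-p", part,                     # target part (ATmega32U4)
        "-P", port,                     # serial port of the programmer
        "-b", str(baud),                # serial baud rate
        "-U", f"flash:w:{hex_path}:i",  # write an Intel hex file to flash
    ]


if __name__ == "__main__":
    # "bootloader.hex" and "COM3" are placeholders.
    cmd = build_avrdude_command("bootloader.hex", "COM3")
    print(shlex.join(cmd))
```

The function only prints the command, so nothing gets flashed until you run it yourself (or pass the list to `subprocess.run`).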
After you have programmed the bootloader, you can flash the keymap onto your keyboard over USB with
QMK Toolbox (you need a keymap hex file for this).