In recent years, transfer learning in natural language processing has been dominated by increasingly large models following a pretraining-finetuning approach. A problem with these models is that growing model size goes hand in hand with growing training costs. In this work, we instead evaluate the transfer learning capabilities of the recently introduced knowledge-infused representations. Previously, these infused representations have been shown to transfer experts' knowledge into a downstream model when the expert and downstream model operate on the same task domain. We extend this by investigating the effects of different expert task configurations on the performance of the downstream model. Our results show that differing expert and downstream tasks do not negatively affect the downstream model. This indicates a desired robustness of the model towards the addition of irrelevant information. At the same time, the ability to transfer important information is retained, as we continue to see a significant performance improvement when adding two experts with differing tasks. Overall, this solidifies the potential of knowledge-infused representations regarding their ability to generalize across different tasks and to recycle old computations for smaller new downstream models.